PREFACE


"The  search  for  excellence  continues 


One of the most fundamental concerns in the management of the agri-food
sector pertains to the quality and safety of food and agricultural products.
This concern is shared equally by the producer, the consumer, and the
regulatory authority, all of whom seek methods of ensuring the production of
high quality food products.

The achievement of excellence in food production requires the establishment of
an effective total quality management system that systematically lays down
precise procedures. Additionally, effective decision-making within the
framework of any quality management system, whatever its nature, type or
magnitude, requires the ability to analyse and interpret scientifically the
information generated by the system. This book is intended to provide some
guidelines for establishing quality assurance procedures as well as basic
statistical tools for the analysis and interpretation of data. The concepts
and techniques are presented in the simplest manner and are expected to be of
value to anyone engaged in the management of quality and safety of food
products.

The author wishes to thank Donna McGovern for extensive assistance in the
preparation of the book, Roger Trudel for providing excellent professional
comments, and Elise Benoit for help with the design work. I would also
like to express my profound gratitude to my wife Shashi and my two daughters
Pamela and Anu for their patience and endurance.

Ottawa,  Canada  Subhash  C.  Puri 


To  my  in-laws:  Ram  Parkash  and  Bimla  Sahney 


CHAPTER  1 
TOTAL  QUALITY  MANAGEMENT 


1.1     Introduction 

Food  is  fundamental  to  human  survival.  Its  quality,  therefore,  is  of  paramount 
importance  to  all  of  us.  We  must  all  endeavour  to  ensure  that  our  agri-food 
products are wholesome, nutritious, safe and risk-free. This goal can be
achieved by setting high quality-safety standards and instituting effective total
quality management systems to ensure conformance to those standards.

There  are  generally  three  principal  parties  involved  in  the  agri-food  business: 
the  producer,  the  consumer,  and  the  regulatory  agency.  The  consumer  specifies 
the  quality/safety  requirements,  the  producer  undertakes  to  meet  them,  and  the 
regulatory  authority  confirms  that  the  stipulated  requirements  have  been  met. 
Each  party  has  an  important  role  to  play,  but  it  is  through  their  collective 
determination,  commitment  and  cooperation  that  excellence  in  food  quality  can 
be  achieved. 

The  basic  issues  relating  to  food  quality  and  safety,  critical  to  all  three  parties, 
include  the  following: 

•  Fitness  for  consumption 

•  Wholesomeness,  nutrition,  and  safety 

•  Conformance  to  specifications,  tolerances,  standards  and  regulations  set 
for  quantity,  quality,  and  contents 

•  Proper use and declaration of pesticides, insecticides, herbicides,
fertilizers, residues, additives, preservatives and other chemicals

The responsibility for producing high quality food products must be shared
equally by the three parties. For example, the producer has the following
three-fold responsibility:

•  to  achieve  and  sustain  the  actual  quality  of  the  product  in  a  manner  that 
will  continually  satisfy  the  consumer's  expected  needs, 

•  to  provide  evidence/confidence  to  its  own  management  that  the  intended 
quality  is  being  achieved, 

•  to provide evidence/confidence to the regulatory authority and to the
purchaser that the intended quality is being achieved in the delivered product.
When contractually required, this provision of evidence/confidence may
involve agreed demonstration of compliance.

The regulatory agency has a dual role to play: assisting producers in the
production of high quality products and safeguarding consumers through
enforcement verification. It should, therefore, help producers to develop and
establish an internal total quality management system, and it should establish
its own quality assurance protocol to which the producer is expected to conform.


The  consumer,  while  only  being  a  recipient  of  the  product,  must  attempt  to 
provide  effective  feedback  and  communication  to  the  producer  as  well  as  the 
regulatory  agency  regarding  the  quality,  quantity,  price,  safety  and  risk  aspects 
of  food  products. 

An effective total quality management system helps to apportion responsibility
and direct it to the true sources. It provides a systematic mechanism for
continuous quality diagnosis and improvement. Some of the major benefits that
accrue from a properly implemented TQM program are: consistency and uniformity
in the application of procedures; reduction in final product inspection,
verification and audit; increased quality and credibility; reduction in costs
associated with appraisal, detection, prevention, internal failures, external
failures, waste, nonconformity, and inspection; and continuous improvement in
product quality and safety.


1.2     Definitions  and  Terminology 

Total Quality Management (TQM) refers to the totality of functions necessary
for the overall management of food quality and safety. As a corporate strategy
for continuous quality improvement, it provides a structured, disciplined
approach for identifying and solving problems as well as an adaptable framework
for institutionalizing the ensuing improvements. Total quality systems are
known by several different names: TQM (Total Quality Management), TQC
(Total Quality Control), CWQC (Company-wide Quality Control, used by the
Japanese), etc. A TQM system has basically three components:

•  Quality  Management  (Q.M.):      management  functions 

•  Quality  Control  (Q.C.):  operational  techniques  and  activities 

•  Quality  Assurance  (Q.A.):  planned  actions  to  provide  confidence 

Following  is  a  list  of  definitions  of  some  quality-related  terms,  as  developed  in 
the  International  Standard,  ISO  8402:  "Quality  -  Vocabulary". 

Quality:  The totality of features and characteristics of a product or service
that bear on its ability to satisfy stated or implied needs.

Note:  Quality is also defined as "fitness for use", "fitness for purpose",
"conformance to the requirements", or "customer satisfaction". The Japanese
Standards Association defines it as "the loss imparted to society from the
time a product is shipped". Quality is a state rather than a specific
characteristic of an entity. As a sum total of all the characteristics of a
product, it should be all-encompassing.

Quality Management:  That aspect of the overall management function that
determines and implements the quality policy.

Quality Assurance:  All those planned and systematic actions necessary to
provide adequate confidence that a product or service will satisfy given
requirements for quality.

Quality Control:  The operational techniques and activities that are used to
fulfill requirements for quality.

Quality System:  The organizational structure, responsibilities, procedures,
processes and resources for implementing quality management.


1.3     Developing  a  TQM  Program 

The major responsibility for producing a product right the first time rests with
the producer, who can achieve better quality by introducing effective controls
through a TQM program. An internal TQM program properly implemented and
actively operating at the producer's facility can reduce the need for extensive
internal final product inspection or external verification and audit by a
regulatory agency. A total quality management system is a vehicle for continuous
quality improvement.

The guidelines for developing a TQM program can only be given in terms of
generic elements. These elements then help in preparing a unique TQM program
for any particular enterprise commensurate with its specific requirements.
For instance, a TQM program for a manufacturing facility must incorporate the
following components:

•  Quality Management System: includes all quality management aspects
such as management responsibility, management systems, control systems,
cost systems, evaluation systems, improvement systems, market
analysis, policy, resource allocation, etc.

•  Quality Control: comprises all critical control points and the operational
and technical aspects of controlling quality during production.

•  Procurement Quality Assurance: quality assurance procedures to ensure
supply of high quality input.

•  Internal Quality Assurance: in-house assessment procedures to assure
that the final product is of high quality.

•  External Quality Assurance: contractual quality assurance protocol to
assure the regulatory authority and the purchaser, or ultimately the
consumer, that the delivered product is of high quality.

A  basic  sequence  of  steps  in  the  development  of  a  TQM  program  includes  the 
following: 

•  Identify  the  situation/product/entity  for  which  a  TQM  program  needs  to  be 
established. 


•  Identify  the  goals  and  objectives. 

•  Prepare  an  exhaustive  list  of  all  the  activities  associated  with  the  situation. 

•  Categorize the activities into management, systems, control, procedural,
analytical, evaluation, verification, audit, action, feedback, improvement,
etc.

•  For  each  category,  list  all  the  requisite  action  items  for  that  category. 

•  For  each  action  item  within  a  category,  outline  the  detailed  instructions 
required  to  effect  the  action. 

•  The  completed  document  will  serve  as  a  TQM  protocol. 


1.4  TQM  Program  Elements 

A generic master list of program elements from a producer's perspective, which
is generally all-encompassing, follows. The list is generic in nature so that it can
be conveniently modified and utilized to devise a TQM program for any specific
entity. It is based on the guidelines developed in: (i) National Standard of
Canada, CAN3-Z299: Quality Assurance Program - Canadian Standards
Association, and (ii) International Standard, ISO 9004: Quality Management and
Quality System Elements - Guidelines.

Elements  of  a  TQM  Program  -  Master  List 

•  Management responsibilities: policy, objectives, planning, management
system, organizational structure and responsibilities

•  Structure of the quality system: quality responsibility and authority,
organizational structure, resources and personnel, operational procedures

•  Documentation of the quality system: quality policies and procedures,
quality manual, quality plans, quality records

•  Auditing of the quality system: audit plan, conducting the audit,
reporting of audit findings and follow-up

•  Review and evaluation of the quality management system

•  Quality related cost considerations: selecting appropriate elements;
collection and analysis of cost data; cost categories: detection, appraisal,
prevention, internal failure, external failure; cost reporting to
management

•  Quality in marketing: market analysis, product brief, customer feedback
information

•  Quality in specification and design: design planning and objectives,
product testing and measurement, design qualification and validation,
design review, design baseline and production review, market readiness
review, design change control, design requalification


•  Quality in procurement: requirements for specifications and purchase
orders, selection of qualified suppliers, agreement on quality assurance,
agreement on verification methods, provision for settlement of
quality disputes, receiving inspection planning and control, receiving
quality records

•  Quality in production: planning for controlled production; process
capability; supplies, utilities and environments

•  Control of production: material control and traceability, equipment
control and maintenance, special processes, documentation, process
change control, control of verification status, control of nonconforming
materials

•  Product verification: incoming materials and equipment, in-process
inspection, completed product verification

•  Control of measuring and test equipment: measurement controls,
elements of control, supplier measurement controls, corrective action,
outside testing

•  Nonconformity: identification, segregation, review, disposition,
documentation, prevention of recurrence

•  Corrective action: assignment of responsibility, evaluation of
importance/priority, investigation of possible problems, analysis of problem,
preventive measures, process control, disposition of nonconforming
items, permanent changes

•  Handling and post-production functions: handling, storage,
identification, packaging and delivery; post-sales service; market reporting
and product supervision

•  Quality documentation: specifications, inspection instructions, test
procedures, work instructions, quality manual, operational procedures,
quality assurance procedures, etc.

•  Quality records: inspection reports, test data, qualification reports,
validation reports, audit reports, material review reports, calibration data,
quality cost reports, etc.

•  Personnel: training, qualifications, appraisal, motivation

•  Product safety and liability: suitable safety standards; declaration of
quality, quantity and content; risk warning to user; product traceability
for safety assurance

•  Use of statistical methods: market analysis, process control,
conformance/compliance level, process average, data analysis, safety
evaluation and risk analysis, statistical sampling procedures, quality
control charts, design methodology, performance appraisal,
setting/changing of tolerances, etc.


1.5  TQM  Program  for  a  Regulatory  Agency 

A  TQM  program  for  a  food  and  agriculture  regulatory  agency  can  be  likewise 
developed  by  selecting  appropriate  elements  from  the  above  master  list.  A 
generic  list  of  such  elements  could  include  the  following: 

•  Management  responsibilities 

•  Structure  of  the  quality  system 

•  Documentation  of  the  quality  system 

•  Monitoring/inspection/verification  plans 

•  Description  of  tolerances,  specifications  and  regulatory  requirements 

•  Quality  documentation  and  records 

•  Action  on  nonconformance 

•  Corrective  action 

•  Use  of  statistical  methods  for  information  analysis 

Furthermore,  for  each  of  these  main  elements,  a  sequential  list  of  sub-elements 
is  prepared,  for  each  of  which  a  set  of  detailed  instructions  is  also  provided. 
Some  of  the  essential  operational  considerations  to  be  incorporated  into  the 
sub-elements  include  the  following: 

•  To  determine  the  comparative  risk  level  associated  with  each  commodity. 

•  For each risk level or category, to identify a suitable frequency of
inspection level.

•  For  each  risk  level,  production  volume  and  quality  status,  to  determine  a 
frequency  of  product  inspection. 

•  For each inspection visit, to specify the sample size for inspection
commensurate with the acceptable quality level and established compliance
rate.

•  For  each  sampling  inspection  activity,  to  ensure  lot  homogeneity  and  the 
randomness/representativeness  of  sampling  procedures. 

1.6  Example: Quality Assurance Program for Analytical Laboratories

From  the  general  guidelines  and  program  elements  outlined  in  Section  1.4,  it  is 
relatively  easy  to  develop  a  quality  assurance  protocol  for  any  entity.  Consider 
as  one  such  application  the  establishment  of  a  quality  assurance  program  for 
an  analytical  laboratory.  As  a  first  step,  the  following  action  plan  should  be 
considered: 

•  Establish a profile of total quality assurance requirements.

•  Evaluate the existing laboratory practices with respect to these
requirements.

•  Outline the precise procedures that would describe how the quality
assurance requirements are to be applied to the laboratory.

•  Indoctrinate  and  train  the  laboratory  personnel  in  the  new  or  modified 
practices  and  procedures. 

•  Establish  a  management  system  of  periodic  assessment  to  ensure  that  the 
program  is  actually  effective. 

The  next  step  is  to  develop  a  list  of  quality  assurance  program  elements  that 
would  encompass  all  of  the  activities  of  a  laboratory's  operation.  Following  the 
guidelines  of  Section  1.4,  the  essential  elements  would  include  the  following: 

•  Quality  Assurance  Plan 

•  Policy  Statements 

•  Objectives 

•  Quality  Planning 

•  Quality  Costs 

•  Record  Keeping  and  Document  Control 

•  Chain  of  Custody  Procedures 

•  Quality  Training 

•  Procurement  and  Control 

•  Reagents  and  Reference  Standards 

•  Instrument  Calibration 

•  Preventive  Maintenance 

•  Sampling  -  Sample  Identification  and  Control 

•  Data  and  Methods  Validation 

•  Laboratory  Analysis  and  Control 

•  Interlaboratory  and  Intralaboratory  Testing 

•  Statistical  Quality  Control  Procedures 

•  Corrective  Action  Mechanism 

•  Safety  Procedures 

•  Laboratory  Design 

•  Performance  and  Systems  Audits 

•  Reports  to  Management 

Another essential feature of the quality assurance protocol is the establishment
of a quality assurance manual. This manual clearly identifies the specific
methods and operating procedures that the laboratory uses to satisfy its own
needs and achieve its quality objectives. A quality assurance manual is a set of
documents intended to give confidence in the laboratory's work. The manual
identifies the policy, organization, objectives, functional activities and
specific quality assurance activities designed to achieve the quality goals set
out for the operation of the laboratory.

A  typical  format  for  a  laboratory  quality  assurance  manual  would  appear  as 
follows: 

Title  page,  with  provisions  for  approved  signatures 

Table  of  contents 

Laboratory  organization  and  responsibilities 

Organizational  structure 

Quality  assurance  plan  and  objectives 

Quality  assurance  system 

Corrective  action 

Forms 

Quality  assessment  procedures 

Quality  assurance  reports  to  management 

Distribution  list 

1.7     Continuous  Quality  Improvement 

Quality is not a tactical but a strategic issue. Quality is everyone's business and
cannot be manufactured. It is infused and embedded into a product through
systematic means. In brief, quality is a long-term focus, not a short-term
function. Therefore, to realize quality goals, it is imperative to first
establish an effective TQM system and then develop a quality improvement
monitoring program. Some of the essential components and elements that a
quality improvement program should encompass are as follows:

The  Program: 

•  Complete  and  active  management  commitment,  support,  and  participation 

•  Highly  visible,  action-oriented,  exhibiting  seriousness  of  intentions,  full 
participation  of  everyone  concerned 

•  Total  worker  immersion  and  awareness 

•  Worker  respect  and  recognition 

•  Customer-oriented  quality  control 

•  Long-term strategic focus

The System:

•  Quality  improvement  teams 

•  Project-by-project  quality  control 

•  Management  orientation  and  training 

The  Methodologies: 

•  Use of statistical tools for management decision-making: frequency
distribution, histogram, cause-effect diagram, Pareto analysis, economic
cost analysis, etc.

•  Formal, structured, disciplined approach: task analysis, problem analysis,
root cause identification, corrective-preventive action procedure, etc.

•  Statistical  process  control 

A  typical  check-list  of  questions,  commensurate  with  the  program  elements 
described  above,  should  be  formulated  on  the  following  lines: 

•  Were  the  goals  and  objectives  clearly  disseminated  and  understood  by 
everyone  concerned? 

•  Were  the  assigned  roles  and  responsibilities  appropriate,  clearly  defined, 
well  understood  and  accepted? 

•  Were  the  allocated  resources  (human,  financial,  technological)  suitable  and 
optimal? 

•  Are  there  any  deviations  from  the  expected  results? 

•  Who  is  to  be  held  accountable  for  the  deviations? 

•  What  type  of  action  is  required  to  achieve  the  expected  results  and  move 
towards  further  improvement? 

•  Can  a  system  be  instituted  to  automatically  check  the  periodic  status  of  the 
program? 

As a measure of program performance, the following procedure is recommended:

•  Assign  a  project  team;  clarify  the  problem;  establish  theories  and  dominant 
causes;  develop  corrective  action. 

•  Implement  and  communicate  action  plan. 

•  Select an issue; identify the characteristics to be measured; collect
pertinent data.

•  Analyze  data  through  appropriate  statistical  methods. 

•  Test  results;  measure  progress;  confirm  removal  of  dominant  causes  as 
planned. 

•  Standardize; establish/revise procedures; review other problems.


1.8     Productivity Measurement and Improvement

Productivity and quality are inseparable concepts. We must measure productivity
as our ability to provide high-value products and services that meet customer
requirements at a minimum cost. The priorities are safety, quality, and cost.

Productivity  can  be  defined  as  the  quality,  timeliness,  and  cost-effectiveness 
with  which  an  organization  achieves  its  mission.  It  is  a  measure  of  how  well 
resources  are  brought  together  and  utilized  in  accomplishing  a  set  of  results. 

Productivity  and  production  are  not  the  same  thing.  Greater  production  does  not 
necessarily  mean  greater  productivity.  Whereas  productivity  is  concerned  with 
the  effective  utilization  of  resources,  production  is  concerned  with  the  process 
and/or  the  methodologies  of  producing  goods  and  services.  A  productive  plant 
is  one  that  has  a  large  production  volume  relative  to  the  amount  of  material, 
energy,  labour,  capital  and  other  resources  consumed. 

Productivity  is  a  combination  of  effectiveness  and  efficiency  and  is  expressed  as 
an  index: 

$$\text{Productivity Index} = \frac{\text{Resource Output}}{\text{Resource Input}} = \frac{\text{Effectiveness}}{\text{Efficiency}}$$

Effectiveness  relates  to  how  well  an  objective  is  reached  and  is  concerned  only 
with  the  achievement  of  desired  results  without  serious  regard  for  the  costs 
involved.  Efficiency  refers  to  how  well  the  available  resources  are  utilized  in 
achieving  the  stated  results  and  is  concerned  with  the  total  cost  of  all  the  inputs 
involved. 

A  more  elaborate  index  of  productivity  advocated  by  Craig  and  Harris  (1973)  is 
given  as  follows: 

$$P_t = \frac{O_t}{L_t + C_t + R_t + Q_t}$$

where

$P_t$  =  productivity measurement for period $t$

$O_t$  =  total output of good units produced in period $t$
(measured in deflated or base-year dollars)

$L_t$, $C_t$, $R_t$, $Q_t$  =  base-year dollar value of all labour, capital, raw
material, and miscellaneous goods and services
consumed in period $t$, respectively

To improve productivity, the productivity index must be monitored and
analyzed. This analysis is carried out with the help of control chart methods
described  in  Chapter  5. 
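
The Craig and Harris index above lends itself to a short calculation. The following is a minimal sketch in Python, assuming that output and the four input categories are already expressed in base-year dollars; the class name, function name and all dollar figures are hypothetical and serve only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class PeriodData:
    """Base-year dollar figures for one period t (all values hypothetical)."""
    output: float         # O_t: good units produced, in base-year dollars
    labour: float         # L_t
    capital: float        # C_t
    raw_material: float   # R_t
    miscellaneous: float  # Q_t


def total_productivity(p: PeriodData) -> float:
    """Return P_t = O_t / (L_t + C_t + R_t + Q_t)."""
    total_input = p.labour + p.capital + p.raw_material + p.miscellaneous
    if total_input <= 0:
        raise ValueError("total input must be positive")
    return p.output / total_input


# Hypothetical three-period series, not data from the text.
periods = [
    PeriodData(500_000, 150_000, 100_000, 120_000, 30_000),
    PeriodData(540_000, 155_000, 100_000, 125_000, 30_000),
    PeriodData(520_000, 160_000, 105_000, 125_000, 30_000),
]

index_series = [total_productivity(p) for p in periods]
for t, value in enumerate(index_series, start=1):
    print(f"period {t}: P_t = {value:.3f}")
```

A series of this kind, computed period by period, is what would then be plotted and monitored.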


1.9     Diagnostic Methods: Cause-Effect Diagrams

A  process  is  a  set  of  conditions  or  a  system  of  causes  which  work  together  to 
produce  a  given  result  or  an  effect.  Most  generally,  the  causes  relate  to  the  six 
M's:  Material,  Machine,  Man,  Method,  Money,  and  Management.  An  effective 
way  of  studying  the  relationship  between  causes  and  an  effect  is  through  the  use 
of cause-effect (C-E) diagrams developed by the Japanese professor Kaoru
Ishikawa in 1950. This diagram is also known by other names, such as
brainstorming diagram or fishbone diagram. A basic C-E diagram is shown in
Figure  1.1.  The  diagram  serves  as  a  diagnostic  tool  to  recognize  the  problem  or 
effect,  identify  all  the  causes,  evaluate  operational  procedures,  identify  solutions 
to  correct  the  problem,  and  help  in  process  quality  improvement. 

Construction  of  C-E  Diagram 

•  Identify  the  quality  characteristics  or  effects  for  which  causes  are  to  be 
found. 

•  Draw  a  horizontal  line  with  an  arrow  at  the  right  end  and  a  box  in  front  of 
it  indicating  the  effect  or  problem. 

•  Write  the  main  factors  or  causes,  i.e.,  the  six  M's:  Material,  Man,  Method, 
Machine,  Money,  Management,  joining  each  of  these  by  a  slanted  arrow 
directed  to  the  horizontal  arrow. 

•  Add  twigs  with  directed  arrows  to  identify  sub-causes  to  the  main  cause. 
Proceed  similarly  in  adding  sub-sub-causes. 

•  Ensure that all possible causes are taken into account and prioritized.

•  The  completed  graph  or  chart  is  a  C-E  diagram. 


[Diagram: box labelled Problem/Effect]

Figure 1.1     Cause-Effect Diagram Identifying Major Causes

As an example, a complete schematic of a C-E diagram is presented in Figure
1.2  for  a  problem  associated  with  'cap  torque  defects'  on  the  processing  line  of 
a  food  processing  plant.  Once  the  defect  had  been  identified,  a  comprehensive 
search  was  conducted  for  all  possible  causes  which  were  diagnosed  with  the 
help  of  the  C-E  diagram.  It  was  then  quite  easy  to  identify  and  prioritize  the 
causes,  allowing  prompt  and  effective  management  decision-making  for 
corrective/preventive action.

Figure 1.2     Cause-Effect Diagram: Cap Torque Defects


1.10     Diagnostic Methods: Pareto Analysis

Pareto analysis is another useful diagnostic technique, supplementing the
method of cause-effect diagrams, which helps in prioritizing and analyzing
causes of nonconformity. The principal idea behind this technique rests on the
premise that events or causes are not uniformly distributed as far as their
effects are concerned. They are maldistributed in the sense that relatively few of
the causes are responsible for most of the effects. The Pareto principle calls it
the 80-20 phenomenon, implying that often about 80% of the nonconformities
result from only 20% of the causes.

The  original  concept  of  the  Pareto  distribution  was  developed  by  Vilfredo 
Pareto,  a  nineteenth-century  Italian  sociologist-economist,  with  regard  to  the 
maldistribution  of  wealth  and  income.  He  suggested  that  80%  of  the  wealth  in  a 
country  is  normally  controlled  by  20%  of  the  people.  This  idea  was  extended  to 
quality  control  applications  by  Juran  in  1964.  Juran  suggested  sorting  out  the 
causes  of  nonconformity  into  two  categories,  the  'vital  few'  and  the  'trivial 
many',  and  then  prioritizing  data  so  as  to  take  corrective  action  on  the  vital 
few  causes  which  contribute  to  the  major  losses  due  to  nonconformity. 

Construction  of  Pareto  Diagram 

•  List  all  the  essential  process  elements  of  interest. 

•  Decide  on  the  mode  of  data  classification,  i.e.,  defect  type,  part  number, 
shift,  operation,  etc. 

•  Select  an  appropriate  time  period  and  collect  all  pertinent  data. 

•  For  each  category,  record  the  total  frequency  of  occurrence. 

•  Order  the  elements  according  to  this  measure,  not  their  classification. 

•  Plot  a  frequency  bar  graph,  beginning  on  the  left  with  the  category  of 
highest  frequency  and  moving  to  the  right  with  categories  of  successively 
lower  frequency.  An  effective  diagram  has  five  to  six  categories. 

•  On  the  same  diagram,  plot  the  line  graph  for  the  cumulative  frequency  of 
each  category. 

•  Add  a  title  and  legend  to  the  graph. 

•  Take  corrective  action  on  the  'vital  few'. 
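
The tabulation behind these steps (ordering, percent share, cumulative share) can be sketched in a few lines. The following is a minimal Python sketch, assuming the loss figures of Table 1.1 as input; the function name and row layout are illustrative choices. Because the shares are computed directly from the costs, they may differ from the printed table in the last decimal place.

```python
def pareto_table(losses):
    """Return rows (rank, category, value, percent, cumulative percent),
    ordered from the largest value to the smallest."""
    total = sum(losses.values())
    ordered = sorted(losses.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0.0
    for rank, (category, value) in enumerate(ordered, start=1):
        running += value
        rows.append((rank, category, value,
                     100 * value / total, 100 * running / total))
    return rows


# Cost figures ($) as listed in Table 1.1, entered in arbitrary order.
losses = {
    "Spillage": 15_000,
    "Line Downtime": 30_000,
    "Reblend": 3_000,
    "Container and Closure Waste": 18_000,
    "Batch Adjustment": 5_000,
    "Customer Complaint Adjustments": 1_000,
    "Damage due to Material Handling": 4_000,
}

for rank, category, value, pct, cum in pareto_table(losses):
    print(f"{rank:>2}  {category:<34}{value:>8,}  {pct:5.1f}  {cum:6.1f}")
```

The rows can be passed to any plotting tool to draw the bars (value) and the cumulative line (last column) on the same chart.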


Example  1.1     Pareto  Analysis 

Consider  a  processing  operation  producing  the  product  'Mayonnaise'  in  a  food 
processing  plant.  The  year-end  analysis  of  the  cost  figures  revealed  heavy  losses 
due  to  quality  failures.  Table  1.1  lists  the  various  cost  categories  in  rank  order 
and the corresponding Pareto diagram is shown in Figure 1.3.

TABLE 1.1:  Losses Due to Quality Failures

Rank  Category                            Cost ($)   Percent Cost

  1   Line Downtime                         30,000        39.6
  2   Container and Closure Waste           18,000        23.7
  3   Spillage                              15,000        19.7
  4   Batch Adjustment                       5,000         6.5
  5   Damage due to Material Handling        4,000         5.3
  6   Reblend                                3,000         3.9
  7   Customer Complaint Adjustments         1,000         1.3

      Total                                 76,000       100.0

[Pareto chart: vertical axis marked 10 to 40; horizontal axis: Quality Failures
(Rank); categories grouped into Vital Few and Trivial Many]

Figure 1.3     Pareto Diagram for Data in Table 1.1

As can be seen from this analysis, the major cause of losses seems to be the
average  line  downtime.  This  category  can  be  further  analysed  by  identifying  and 
prioritizing  the  causes  as  given  in  Table  1.2  and  the  corresponding  Pareto 
diagram  in  Figure  1.4. 

TABLE 1.2: Average Downtime (minutes per shift) on Packing Line (3-week period)

Rank  Category              Downtime   Percent    Cumulative
                            (mins.)    Downtime   % Downtime
----  --------------------  --------   --------   ----------
1     Case Packer Problem         58       30.1         30.1
2     Labeler Problem             47       24.4         54.5
3     No Glass                    39       20.2         74.7
4     No Product                  12        6.2         80.9
5     No Caps                     10        5.2         86.1
6     Capper Problem               9        4.7         90.8
7     Broken Glass                 8        4.1         94.9
8     Glue Pot Empty               6        3.1         98.0
9     Glue Condition               3        1.5         99.5
10    Case Taping                  1        0.5        100.0

      Total                      193
[Figure: Pareto diagram of the categories in Table 1.2. Horizontal axis:
Categories (Rank); left-hand scale marked 30, 60 and 90; right-hand cumulative
scale marked 40% to 100%; annotated with the 'Point of Diminishing Returns'.]

Figure 1.4     Pareto Diagram for Data in Table 1.2
From the above analysis, it is now easy for management to make a decision on
which aspect of failure should be tackled first. Since it is not cost-effective to
try to prevent all defects, Pareto analysis helps in effectively prioritizing our
strategy for corrective action. The real challenge for quality management,
however, still lies in preventing defects from ever occurring in the first place.
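The Pareto procedure described above can be sketched in a few lines of Python. This is only an illustration, using the cost figures of Table 1.1; the function names `pareto_table` and `vital_few` and the 80% cutoff are assumptions introduced here, not part of the text. Because the percentages are computed without intermediate rounding, they may differ in the last digit from the printed tables.

```python
def pareto_table(values_by_category):
    """Rank categories from largest to smallest and add percentages.

    Returns a list of tuples:
    (rank, category, value, percent of total, cumulative percent).
    """
    total = sum(values_by_category.values())
    ranked = sorted(values_by_category.items(),
                    key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for rank, (category, value) in enumerate(ranked, start=1):
        running += value
        rows.append((rank, category, value,
                     100 * value / total, 100 * running / total))
    return rows


def vital_few(rows, cutoff=80.0):
    """Leading categories needed to reach `cutoff` percent of the total.

    The 80% cutoff is an illustrative assumption, not a rule from the text.
    """
    chosen = []
    for _rank, category, _value, _percent, cumulative in rows:
        chosen.append(category)
        if cumulative >= cutoff:
            break
    return chosen


# Cost of quality failures ($) as listed in Table 1.1.
losses = {
    "Line Downtime": 30000,
    "Container and Closure Waste": 18000,
    "Spillage": 15000,
    "Batch Adjustment": 5000,
    "Damage due to Material Handling": 4000,
    "Reblend": 3000,
    "Customer Complaint Adjustments": 1000,
}

rows = pareto_table(losses)
for rank, category, value, percent, cumulative in rows:
    print(f"{rank}  {category:<32} {value:>7,} {percent:5.1f} {cumulative:6.1f}")
print("Vital few:", vital_few(rows))
```

The cumulative column is what the line graph of a Pareto diagram plots; the bars are the individual values in rank order.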
CHAPTER 2
BASIC STATISTICAL METHODS


2.1  Introduction 

Every  scientific  investigation  yields  information  and  numerical  data.  Repeatedly 
scanning  through  individual  records  of  raw,  unorganized  data  is  generally 
tedious. There is a need for brevity in the description and summary of results for
effective  management  decision-making.  This  is  accomplished  through  statistical 
methods  which  provide  scientific  tools  for  organizing,  summarizing,  analyzing, 
and  interpreting  the  results. 

From  a  statistical  viewpoint,  each  value  in  a  group  of  measurements  (sample)  is 
considered  as  only  one  realization  of  a  hypothetical,  infinite  population  of 
similar  measurements.  Although,  in  general,  all  members  of  this  sample  refer  to 
measurements  of  the  same  property  on  the  same  population  or  lot,  they  are  not 
expected  to  be  identical.  The  differences  among  them  are  attributable  to  chance 
factors  as  well  as  a  multitude  of  other  assignable  factors  associated  with  the 
measuring  process.  The  fundamental  aim  of  statistics,  therefore,  is  to  identify 
these  causes  of  variation,  evaluate  their  significance,  and  ultimately  provide 
means  to  make  inferences  from  a  sample  to  a  population. 

Some of the questions most frequently asked during a study, survey or
experiment include:

•  how  to  design  a  statistically  valid  experiment 

•  what sample size to take so that it will be cost-effective, feasible,
statistically valid, and sufficient to provide reliable estimates with an effective
decision criterion

•  how  to  ensure  a  sample's  homogeneity,  randomness  and  representativeness 
of  the  lot  or  population 

•  how  to  estimate  population  parameters  from  sample  statistics  with  a  high 
degree  of  confidence 

•  how  to  study  differences  between  several  sample  results 

•  how  to  carry  out  regression  and  correlation  analysis  for  forecasting 

•  how  to  establish  control  procedures  to  achieve  consistency,  uniformity, 
repeatability  and  reproducibility  of  results 

2.2  Definitions 

Population or Lot: the total group of units under consideration, the group to
which the results are to be generalized.

Sample: a portion of the population. A sample should be representative of the
population and be chosen in a random fashion. A simple random sample is
one that has been selected by a random process such that each unit in the
population has an equal and independent chance of being included in the
sample.

Kinds  of  Data: 

•  Discrete: where the variable can assume only specific values (usually
integers) and involves counting, e.g., number of cows on a farm,
count of items not meeting a grade, etc. When characteristics such as
these are dichotomous (i.e., defective-nondefective), they are called
attributes.

•  Continuous: data resulting from a measurement or other numerical
estimation procedure; these characteristics yield variables, e.g.,
temperature readings, pH values, crop yield, etc.

2.3     Organization  of  Data 

Statistical data from a scientific study usually consist of a large number of
observations. To obtain meaningful information, this unorganized set of values
must be concisely summarized, described, and presented. The common visual
technique for presenting such data is the histogram or bar graph. For discrete
data, these graphs are generally not difficult to construct. Continuous data such
as weight, temperature, pressure, length, and percentage dry matter are not
already grouped into natural categories and, consequently, must first be
arranged into some convenient grouping. This is done through a frequency table,
whereby the range of the data is divided into a reasonable number of categories
and each observation is assigned to exactly one of these categories. The number
of categories is arbitrary but a good rule of thumb is to let $k = \sqrt{n}$, where k is
the number of categories and n the number of observations. If R is the range of
the data (range = largest observation minus smallest observation), then the
width of each category is approximately R/k. For simplicity, we will only
consider situations in which all categories have the same width.

Example  2.1     Organization  of  Data 

Table  2.1  gives  the  moisture  content  (%)  of  skim  milk  powder  obtained  through 
the  laboratory  analysis  of  50  independent  samples. 


TABLE 2.1: Percent Moisture in Skim Milk Powder

3.4   2.9   4.6   3.9   3.5   2.8   3.4   4.0   3.1   3.7
3.5   3.1   2.5   4.4   3.7   3.2   3.8   3.2   3.7   3.2
3.6   3.0   3.3   4.0   3.4   3.0   4.3   3.8   3.8   3.6
3.4   2.7   3.5   3.6   3.6   3.3   3.7   3.5   4.1   3.1
3.7   3.2   3.9   4.2   3.5   2.9   3.9   3.6   3.4   3.3
The number of categories should be approximately $\sqrt{50}$, or k = 7. The range
of the data is R = 4.6 - 2.5 = 2.1. The consecutive categories, each of
width R/k = 2.1/7 = 0.3, are then 2.5 - 2.8, 2.8 - 3.1, ..., 4.3 - 4.6.
As a convention, any observation falling on the border of two categories will be
put in the higher one of the two. The frequency table may be presented as in
Table 2.2.


TABLE 2.2: Frequency Table for Percent Moisture in Skim Milk Powder

Class         Class      Class       Cumulative
Boundaries    Midpoint   Frequency   Frequency
              (X)        (f)
-----------   --------   ---------   ----------
2.5 - 2.8       2.65          2           2
2.8 - 3.1       2.95          5           7
3.1 - 3.4       3.25         10          17
3.4 - 3.7       3.55         15          32
3.7 - 4.0       3.85         11          43
4.0 - 4.3       4.15          4          47
4.3 - 4.6       4.45          3          50
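The construction of Table 2.2 can be reproduced with a short Python sketch. It is a minimal illustration under stated assumptions: the readings are taken from Table 2.1, and they are converted to whole tenths of a percent (an implementation choice made here, not something the text prescribes) so that observations lying exactly on a class boundary are assigned reliably to the higher class.

```python
import math

# Percent moisture readings from Table 2.1 (50 samples).
moisture = [
    3.4, 2.9, 4.6, 3.9, 3.5, 2.8, 3.4, 4.0, 3.1, 3.7,
    3.5, 3.1, 2.5, 4.4, 3.7, 3.2, 3.8, 3.2, 3.7, 3.2,
    3.6, 3.0, 3.3, 4.0, 3.4, 3.0, 4.3, 3.8, 3.8, 3.6,
    3.4, 2.7, 3.5, 3.6, 3.6, 3.3, 3.7, 3.5, 4.1, 3.1,
    3.7, 3.2, 3.9, 4.2, 3.5, 2.9, 3.9, 3.6, 3.4, 3.3,
]

n = len(moisture)
k = round(math.sqrt(n))          # rule of thumb k = sqrt(n)

# Work in whole tenths so boundary comparisons are exact integer comparisons.
tenths = [round(10 * x) for x in moisture]
low, high = min(tenths), max(tenths)
width = (high - low) // k        # class width in tenths (assumes R/k is a whole number of tenths)

frequency = [0] * k
for t in tenths:
    # A value on a boundary goes to the higher class;
    # the largest observation stays in the last class.
    index = min((t - low) // width, k - 1)
    frequency[index] += 1

cumulative = []
running = 0
for f in frequency:
    running += f
    cumulative.append(running)

for i in range(k):
    lower = (low + i * width) / 10
    upper = (low + (i + 1) * width) / 10
    midpoint = (lower + upper) / 2
    print(f"{lower:.1f} - {upper:.1f}   {midpoint:.2f}   {frequency[i]:>3}   {cumulative[i]:>3}")
```

For data whose range does not divide evenly into k classes, the width would have to be rounded up to a convenient value; that case is not handled in this sketch.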


Once  the  data  have  been  arranged  into  a  frequency  table,  they  may  be  presented 
in  a  histogram  (see  Figure  2.1)  by  plotting  the  class  frequency  (on  the  vertical 
axis)  against  its  boundaries  (on  the  horizontal  axis).  One  assumes  that  all  the 
observations  in  any  category  now  adopt  the  class  midpoint  as  their  value. 
Figure 2.1     Histogram for Data in Table 2.2


2.4     Statistical Measures

Whether  data  is  raw  and  ungrouped  or  summarized  and  grouped  in  a  frequency 
table,  large  sets  of  data  can  be  uninformative.  Data  needs  to  be  characterized, 
especially  for  comparison  purposes  and  decision-making.  Typical  statistical 
summaries  include  measures  of  location  (or  center)  and  measures  of  dispersion 
(or  spread  from  the  center). 


2.4.1     Mean 

The mean provides the most commonly used statistical measure of location. For
a sample of observations $X_1, X_2, \dots, X_n$, the sample arithmetic mean (or
sample mean or, simply, the mean) is denoted by $\bar{X}$ where

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$

The corresponding population mean is denoted by the Greek letter $\mu$ (mu).

For grouped data, where the class midpoints $X_1, X_2, \dots, X_k$ occur with
respective frequencies $f_1, f_2, \dots, f_k$, the mean is defined as

$$\bar{X} = \frac{X_1 f_1 + X_2 f_2 + \dots + X_k f_k}{f_1 + f_2 + \dots + f_k} = \frac{\sum_{i=1}^{k} X_i f_i}{\sum_{i=1}^{k} f_i}$$

2.4.2     Range 

The  range  for  a  set  of  observations  is  the  positive  difference  between  the  largest 
and  smallest  observations.  For  a  frequency  table,  the  range  is  the  positive 
difference  between  the  upper  boundary  of  the  highest  class  and  the  lower 
boundary  of  the  lowest  class. 


2.4.3    Variance  and  the  Standard  Deviation 

The extent of variability in a data set is effectively estimated by calculating the
statistical measure of dispersion known as the variance. For a sample of n
observations $X_1, X_2, \dots, X_n$, the sample variance is denoted by $s^2$ where

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{\sum X_i^2 - \dfrac{(\sum X_i)^2}{n}}{n - 1}$$

The corresponding population variance is denoted by $\sigma^2$ (sigma squared),
where the denominator n - 1 is replaced by the population size.

The square root of the variance is called the standard deviation and is denoted
appropriately by $s$ or $\sigma$. In the case of a grouped frequency distribution, where a
set of class midpoints $X_1, X_2, \dots, X_k$ have the respective frequencies
$f_1, f_2, \dots, f_k$, the variance is given by

$$s^2 = \frac{\sum_{i=1}^{k} (X_i - \bar{X})^2 f_i}{\sum_{i=1}^{k} f_i - 1} = \frac{\sum X_i^2 f_i - \dfrac{(\sum X_i f_i)^2}{\sum f_i}}{\sum f_i - 1}$$


2.4.4     Coefficient of Variation

It is frequently difficult to make a comparison between two or more sets of data
expressed in different units of measurement. Such a comparison is made
possible by a measure known as the coefficient of variation, given by

$$\text{C.V.} = \frac{\text{Standard Deviation}}{\text{Mean}} = \frac{s}{\bar{X}} \times 100\%$$

The C.V., expressed as a percentage, provides a comparison of the average
variability to the mean in a data set and is unit free. A small percentage
indicates a rather homogeneous group of observations.


Example  2.2    Statistical  Measures 

Using the data in Table 2.1, find the mean, range, variance, standard deviation,
and coefficient of variation for moisture content (%) in skim milk powder for
the 50 observations given. For the raw data,

$$\text{Mean} = \bar{X} = \frac{3.4 + 3.5 + \dots + 3.3}{50} = \frac{175.5}{50} = 3.51$$

$$\text{Range} = 4.6 - 2.5 = 2.1$$

$$\sum X_i = 175.5, \qquad \sum X_i^2 = 625.37$$

$$\text{Variance} = s^2 = \frac{625.37 - \dfrac{(175.5)^2}{50}}{50 - 1} = 0.1911$$

$$\text{Standard Deviation} = s = \sqrt{0.1911} = 0.4372$$

$$\text{Coefficient of Variation} = \text{C.V.} = \frac{s}{\bar{X}} \times 100\% = \frac{0.4372}{3.51} \times 100\% = 12.46\%$$

For the grouped data (see Table 2.2),

$$\text{Mean} = \bar{X} = \frac{(2 \times 2.65) + (5 \times 2.95) + \dots + (3 \times 4.45)}{2 + 5 + \dots + 3} = \frac{178.1}{50} = 3.56$$

$$\text{Range} = 4.6 - 2.5 = 2.1$$

$$\sum X_i f_i = 178.1, \qquad \sum X_i^2 f_i = 643.565$$

$$\text{Variance} = s^2 = \frac{643.565 - \dfrac{(178.1)^2}{50}}{50 - 1} = 0.1872$$

$$\text{Standard Deviation} = s = \sqrt{0.1872} = 0.4327$$

$$\text{Coefficient of Variation} = \text{C.V.} = \frac{s}{\bar{X}} \times 100\% = \frac{0.4327}{3.56} \times 100\% = 12.15\%$$

2.5     Probability 

There  are  two  approaches  to  defining  probability:  the  classical  and  the  relative 
frequency.  According  to  the  classical  approach,  if  a  procedure  gives  rise  to  n 
equally  likely  outcomes,  of  which  r  have  attribute  A,  then  the  probability  of  A  is 
r/n.  This  definition  is  somewhat  restrictive  since  to  calculate  any  probability  we 
need  to  know  the  value  of  n  and  be  sure  that  each  outcome  is  equally  likely.  Since 
most  food  and  agricultural  problems  do  not  satisfy  these  requirements,  a  more 
pragmatic  definition  of  probability,  called  the  frequency  concept  of  probability,  is 
used.  We  shall  interpret  probability  as  a  relative  frequency  in  a  large  number  of 
trials,  i.e.,  when  we  talk  of  the  probability  of  an  event  A,  we  mean  the  relative 
frequency  of  A  in  a  large  number  of  similar  trials.  If  an  event  of  interest,  A, 
occurs  a  times  in  b  trials  and  if  the  ratio  a/b  approaches  a  limit,  r/n,  as  b  becomes 
arbitrarily  large,  then  r/n  is  called  the  probability  of  A  and  we  write  P(A)   =  r/n. 

One sees, from either definition of probability, that $P(A) \ge 0$, $P(A) \le 1$, and, if $\bar{A}$
represents the complement of A, then $P(\bar{A}) = 1 - P(A)$. Note that, if an event A
is impossible, then r = 0 and P(A) = 0. If an event A is certain (i.e., it occurs at
every trial), then r = n and P(A) = 1.

If,  for  example,  a  sample  of  200  oranges  is  inspected  from  a  large  consignment 
and  10  are  found  to  be  diseased,  then  the  proportion  of  diseased  oranges  (or  the 
relative  frequency  of  diseased  oranges)  in  the  consignment  is  estimated  as  10/200. 
Thus,  the  probability  that  an  orange  is  diseased,  written  as  P  (an  orange  being 
diseased),  is  0.05. 

2.6     Permutations  and  Combinations 

For a set of n items, any arrangement of r of them in a definite order is called a
permutation. The number of different permutations is denoted by ${}_{n}P_{r}$ and
calculated by the following formula:

$${}_{n}P_{r} = \frac{n!}{(n - r)!}$$

where $n! = n(n - 1)(n - 2) \cdots 1$, $(n - r)! = (n - r)(n - r - 1) \cdots 1$, and
$0!$ is taken to be 1. For example, $3! = 3 \cdot 2 \cdot 1 = 6$.

For a set of n items, any subset of r of them (chosen without regard to their order
of selection) is called a combination. The number of different such combinations
is denoted by $\binom{n}{r}$ and calculated by the following formula:

$$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$$

If we consider n repeated trials, each with two possible outcomes (e.g., defective
and nondefective), then the total possible number of different arrangements or
sequences that can be obtained, each having x defectives and n - x
nondefectives, is $\binom{n}{x}$.
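As a quick numerical check of these formulas, the sketch below uses `math.perm` and `math.comb` from the Python standard library (available from Python 3.8). The values n = 5, r = 2 and the four-trial sequence are hypothetical numbers chosen here for illustration only.

```python
import math
from itertools import product

# Hypothetical illustration: choosing r = 2 items from n = 5.
n, r = 5, 2
permutations = math.perm(n, r)    # n! / (n - r)!
combinations = math.comb(n, r)    # n! / (r! (n - r)!)
print(permutations, combinations)

# Sequences of 4 trials (D = defective, N = nondefective) containing
# exactly 2 defectives, enumerated directly and counted by the formula.
trials, defectives = 4, 2
sequences = [s for s in product("DN", repeat=trials)
             if s.count("D") == defectives]
print(len(sequences), math.comb(trials, defectives))
```

Each ordered arrangement of r items corresponds to r! orderings of the same subset, which is why the number of permutations equals r! times the number of combinations.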


2.7     Statistical  Distributions 

Any  set  of  data,  whether  discrete  or  continuous,  exhibits  a  distributional  pattern. 
The  analysis  of  a  data  set  becomes  easier  if  it  belongs  to  a  distribution  whose 
properties  are  known.  The  three  main  probability  distributions  that  deal  with 
counting  or  discrete  data  are  the  binomial,  Poisson  and  hypergeometric.  For 
measured  or  continuous  data,  probabilities  are  generally  calculated  by  using  the 
normal  distribution. 


2.7.1     Binomial  Distribution 

If p is the probability that any item is defective, then the probability that in a
random sample of n items there will be x defectives is given by the binomial
distribution whose formula is as follows:

$$P(x \text{ defectives among } n \text{ items}) = \binom{n}{x} p^{x} (1 - p)^{n - x},$$

for x = 0, 1, 2, ..., n.

Example 2.3     Binomial Distribution

A labelling process is known to produce 20% defective items. What is the
probability of finding two defectives in a sample of four items?

Here, n = 4, x = 2, p = 0.2.

$$P(2 \text{ defectives}) = \binom{4}{2} (0.2)^2 (0.8)^2 = 0.1536.$$


2.7.2     Poisson Distribution

In the case of a rare, random event, where n is large and p is small, probabilities
are calculated by using the Poisson distribution as follows:

$$P(x \text{ defectives among } n \text{ items}) = \frac{e^{-\lambda} \lambda^{x}}{x!},$$

where $\lambda = np$ and e = 2.718. The approximation of binomial probabilities,
using the Poisson distribution, is generally adequate if n is larger than 20 and p
is smaller than 0.05. If n is larger than 100, then p may be as large as 0.1.


Example 2.4     Poisson Distribution

Reconsider the above example of the labelling process. Suppose that the
proportion of defectives is 5% and every hour a sample of forty items is taken. What is
the probability of finding in a sample one defective item?

Here, n is large (40), p is small (0.05), and $\lambda = np = 2$. Using the Poisson
distribution,

$$P(1 \text{ defective}) = \frac{e^{-2} \times 2^{1}}{1!} = 0.2707.$$
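Examples 2.3 and 2.4 can be verified with a few lines of Python. The helper names `binomial_pmf` and `poisson_pmf` are introduced here for illustration; the last line also evaluates the exact binomial probability for Example 2.4, to show how close the Poisson approximation is in that case.

```python
import math


def binomial_pmf(x, n, p):
    """P(x defectives among n items) under the binomial distribution."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(x defectives) under the Poisson distribution with mean lam = n * p."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Example 2.3: n = 4, x = 2, p = 0.2.
p_example_23 = binomial_pmf(2, 4, 0.2)

# Example 2.4: n = 40, p = 0.05, so lam = 2; probability of one defective.
p_example_24 = poisson_pmf(1, 40 * 0.05)

# Exact binomial value for the same situation, for comparison.
p_example_24_exact = binomial_pmf(1, 40, 0.05)

print(f"{p_example_23:.4f} {p_example_24:.4f} {p_example_24_exact:.4f}")
```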


2.7.3     Hypergeometric  Distribution 

To calculate probabilities involving samples from small populations, one uses
the hypergeometric distribution. If the population contains N units of which X
are defective and a sample of n units is randomly selected, the probability of
finding x defective units in the sample is given by:

$$P(x \text{ defectives}) = \frac{\binom{X}{x} \binom{N - X}{n - x}}{\binom{N}{n}},$$

for x = 0, 1, 2, ..., n.


Example 2.5     Hypergeometric Distribution

If a population consists of twenty items, of which two are defective, and a
sample of five items is selected for examination, what is the probability of the
sample containing one defective item?

Here, X = 2, x = 1, N = 20 and n = 5.

$$P(1 \text{ defective}) = \frac{\binom{2}{1} \binom{18}{4}}{\binom{20}{5}} = 0.3947$$
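Example 2.5 can be checked numerically as follows. The function name `hypergeometric_pmf` and its parameter names are illustrative choices made here; the formula is the one given in Section 2.7.3.

```python
import math


def hypergeometric_pmf(x, population, defective, sample):
    """P(x defectives in the sample) when sampling without replacement.

    population = N, defective = X (defectives in the population), sample = n.
    """
    return (math.comb(defective, x)
            * math.comb(population - defective, sample - x)
            / math.comb(population, sample))


# Example 2.5: N = 20, X = 2, n = 5, x = 1.
p_one = hypergeometric_pmf(1, population=20, defective=2, sample=5)
print(f"{p_one:.4f}")

# The probabilities over all possible x (here 0, 1, 2) add up to one.
total = sum(hypergeometric_pmf(x, 20, 2, 5) for x in range(0, 3))
print(f"{total:.4f}")
```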


2.7.4     Normal  Distribution 


The most important distribution dealing with continuous or measured data is the
normal distribution, whose formula is:

$$P(X) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left\{ (X - \mu)/\sigma \right\}^2}, \quad \text{for all values of } X,$$

where $\mu$ and $\sigma$ are the population mean and standard deviation. Figure 2.2
shows the graph of a normal distribution. The probability that a variable X,
which has a normal distribution with mean $\mu$ and standard deviation $\sigma$, lies
between two values a and b is the area under the distribution curve between
a and b.

[Figure: bell-shaped curve of P(X), with $\mu$, a and b marked on the horizontal axis.]

Figure 2.2     A Typical Normal Distribution
Some important features of the normal curve include the following:

1.  Once $\mu$ and $\sigma$ are specified, the normal curve is completely determined.

2.  The curve is symmetrical about a vertical axis through the mean. The
observations tend to cluster about the mean.

3.  The total area under the curve is equal to 1.

4.  Although the curve extends indefinitely in both directions, for all practical
purposes, there is negligible area beyond $3\sigma$ on either side of the mean.
Empirically, the following is known (see Figure 2.3):

68.26% of the area is encompassed between $\mu \pm 1\sigma$,

95.44% of the area is encompassed between $\mu \pm 2\sigma$,

99.73% of the area is encompassed between $\mu \pm 3\sigma$.

The normal curve, being fully dependent on $\mu$ and $\sigma$, changes shape and
location with different values of $\mu$ and $\sigma$, thereby generating an infinite family
of distributions. It would be a hopeless task to set up individual tables of
normal probabilities or areas for every conceivable combination of values for $\mu$
and $\sigma$. Fortunately, it is not necessary to do so. We transform the normally
distributed random variable X with mean $\mu$ and standard deviation $\sigma$ to a new
random variable, called Z, where

$$Z = \frac{X - \mu}{\sigma},$$

which has a standard normal distribution with $\mu = 0$ and $\sigma = 1$. The curve
of the standard normal distribution is centered at zero and has the bulk of its
area between -3 and 3. The probability that Z lies between any two numbers
a and b (assuming that a < b) may be evaluated from Appendix Table 1.


Figure 2.3     Percentages of the Normal Distribution


Example  2.6     Normal  Distribution 

Assuming that the moisture content of skim milk powder has a normal
distribution with mean $\mu = 3.51$ (%) and standard deviation $\sigma = 0.4372$ (%), what is the
probability that the moisture content of a randomly chosen sample of skim milk
powder will be between 2.9 and 3.8 (%)?

$$\text{When } X = 2.9, \quad Z = \frac{2.9 - \mu}{\sigma} = \frac{2.9 - 3.51}{0.4372} = -1.40$$

$$\text{When } X = 3.8, \quad Z = \frac{3.8 - \mu}{\sigma} = \frac{3.8 - 3.51}{0.4372} = 0.66$$

The probability that the moisture content (X) is between 2.9 and 3.8 is then the
same as the probability or area under the standard normal curve (Z) between
-1.40 and 0.66, i.e., P(-1.40 < Z < 0.66) as depicted in Figure 2.4.

Figure 2.4     Standard Normal Distribution for Example 2.6

Thus, from Appendix Table 1, we have:

P(-1.40 < Z < 0.66) = P(-1.40 < Z < 0) + P(0 < Z < 0.66)

                    = P(0 < Z < 1.40) + P(0 < Z < 0.66)

                    = 0.4192 + 0.2454

                    = 0.6646.
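Where tables are not at hand, the same probability can be obtained from the `NormalDist` class in Python's standard `statistics` module (Python 3.8 or later). The sketch below computes it twice: once with the Z-values rounded to two decimals, as in a table lookup, and once without rounding, which gives a marginally different figure. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 3.51, 0.4372          # values assumed in Example 2.6

# Standardize the two limits and round them as one would for a table lookup.
z_low = round((2.9 - mu) / sigma, 2)
z_high = round((3.8 - mu) / sigma, 2)

standard_normal = NormalDist()    # mean 0, standard deviation 1
p_table = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Same probability without rounding the Z-values.
moisture = NormalDist(mu, sigma)
p_unrounded = moisture.cdf(3.8) - moisture.cdf(2.9)

print(z_low, z_high)
print(f"{p_table:.4f} {p_unrounded:.4f}")
```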

2.8     Distribution  of  Sample  Mean  and  Sample  Variance 

If a population has N elements in it, then the total possible number of distinct
samples of size n that can be selected from it is $\binom{N}{n}$. For each of these samples
there exists a mean $\bar{X}$, and these $\bar{X}$'s have a distribution of their own. If the original
population has a normal distribution, then the variable $\bar{X}$ itself also has a
normal distribution. If the original population is not normal, then a very
important theorem in mathematical statistics, called the central limit theorem, tells
us that the distribution of $\bar{X}$ is approximately normal and that the approximation
improves as n gets larger. If $\mu$ and $\sigma$ are the mean and standard deviation of the
original population, then the mean and standard deviation of the sampling
distribution of $\bar{X}$, denoted as $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$ respectively, are given as

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \sigma \sqrt{\frac{N - n}{n(N - 1)}}.$$

For large populations (N large), $\sigma_{\bar{X}}$ is approximately equal to $\sigma/\sqrt{n}$ and is
known as the standard error of the mean. Note that the standard deviation
refers to the average variation of the observations for individual units whereas
the standard error refers to the random variation of an estimate in a whole
experiment. If a variable follows any normal distribution, it may be reduced to
the standard normal distribution by subtracting the mean from it and dividing
by its standard deviation. Since the distribution of $\bar{X}$ is normal with mean $\mu$ and
standard deviation $\sigma/\sqrt{n}$, one can standardize it by subtracting $\mu$ from it and
dividing by $\sigma/\sqrt{n}$, giving

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}.$$
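The effect of the population size on the standard error can be seen in a small Python sketch. The function name `standard_error`, the population sizes and the use of 0.4372 and n = 50 (borrowed from the skim milk powder examples) are illustrative assumptions made here.

```python
import math


def standard_error(sigma, n, population_size=None):
    """Standard deviation of the sample mean.

    With a finite population of size N the formula
    sigma * sqrt((N - n) / (n * (N - 1))) is used; with
    population_size=None the large-population form sigma / sqrt(n) is used.
    """
    if population_size is None:
        return sigma / math.sqrt(n)
    big_n = population_size
    return sigma * math.sqrt((big_n - n) / (n * (big_n - 1)))


sigma, n = 0.4372, 50             # illustrative values

se_large = standard_error(sigma, n)
se_small_lot = standard_error(sigma, n, population_size=200)
se_huge_lot = standard_error(sigma, n, population_size=10 ** 9)
se_census = standard_error(sigma, n, population_size=50)

print(f"{se_large:.4f} {se_small_lot:.4f} {se_huge_lot:.4f} {se_census:.4f}")
```

When the sample is a sizeable fraction of the lot the standard error shrinks, and it vanishes when the whole lot is inspected (n = N).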


2.9     t-Distribution

A basic difficulty arises in that $\sigma$ is generally unknown and has to be estimated
by the sample standard deviation s. However, upon subtracting $\mu$ and dividing
by $s/\sqrt{n}$, the variable

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

no longer has the standard normal distribution but follows a t-distribution.
The t-distribution is symmetric, bell-shaped, and centered at zero, just like the
standard normal distribution. However, there is not a single t-distribution but a
family of them, each member of the family being distinguished by its number of
degrees of freedom (d.f.), f, defined as f = n - 1.

Whereas, in the standard normal distribution, 95% of the area lies between
±1.96 (from Appendix Table 1), for every t-distribution, there is a number,
which we denote as $t_{.025}$, such that 95% of the area under the t curve lies
between $-t_{.025}$ and $+t_{.025}$. The actual value of $t_{.025}$ changes for varying
degrees of freedom and these are given in Appendix Table 2, along with other
values of t. For example, let us find the two numbers from the t-distribution
between which 95% of the area is centrally contained when n = 15.

For n = 15, f = 14 and, therefore, $t_{.025} = 2.1448$ from Appendix Table 2.
Thus, 95% of the area under the t-distribution for 14 degrees of freedom lies
between -2.1448 and 2.1448. Notice that, as n increases, the values of $t_{.025}$
and $-t_{.025}$ draw closer together. For a very large n, $t_{.025} = 1.96$, the
comparable value obtained from the standard normal distribution. For all practical
purposes, the t-distribution becomes equivalent to the standard normal
distribution when n > 30.


2.10     Chi-Square  Distribution 


Just as there is a mean $\bar{X}$ for each of the $\binom{N}{n}$ samples of size n that can be
selected from a population of N elements, each sample also possesses a
standard deviation s and a variance $s^2$. As $\bar{X}$ has a normal (or approximately
normal) distribution, so does $s^2$ have a distribution. Specifically, we usually
consider the quantity $(n - 1)s^2/\sigma^2$, which follows what is known as a
chi-square ($\chi^2$) distribution.

Probabilities may be calculated for $(n - 1)s^2/\sigma^2$ (and hence for $s^2$) by finding
appropriate areas under the $\chi^2$ curve, whose values are given in Appendix
Table 3. Like the t-distribution, the $\chi^2$-distribution is a family of distributions,
each distinguished by its number of degrees of freedom f = n - 1. If we wish
to find the two values for the $\chi^2$-distribution which have 95% of the area
centrally contained between them, we denote the smaller value as $\chi^2_{.975}$ and the
larger value as $\chi^2_{.025}$ and then obtain these corresponding values from
Appendix Table 3. For example, if n = 10, the $\chi^2_{.975}$ and $\chi^2_{.025}$ values for f = n - 1
= 9 degrees of freedom are respectively given as 2.700 and 19.023 from
Appendix Table 3.


2.11    Estimation:  Confidence  Intervals 

The sample mean $\bar{X}$ and sample variance $s^2$, calculated from sample
observations to estimate the corresponding population parameters $\mu$ and $\sigma^2$, are known
as point estimators of the population mean and variance. However, the sample
statistics are most likely to differ in value from the respective population
parameters. Consequently, it is generally desirable to establish an
interval within which the population parameters may be expected to lie with a
certain degree of confidence. This procedure, known as interval estimation,
provides a confidence interval which aims at bracketing the true value of a
population parameter by taking into account the uncertainty associated with the
sample estimates. The level of confidence is denoted by $1 - \alpha$, where the Greek
letter $\alpha$ (alpha) is known as the level of significance. For example, when
$\alpha = 0.05$, the computed confidence interval will have a confidence level of
0.95 or 95%.

1.  Confidence Interval for the Mean $\mu$ of a Normal Distribution

(i)  Variance, $\sigma^2$, known

The general $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by

$$\left( \bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \; , \; \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right).$$

More specifically, a 95% confidence interval for $\mu$ is given by

$$\left( \bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} \; , \; \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}} \right).$$

Here $1 - \alpha = 0.95$ and $Z_{\alpha/2} = Z_{.025} = 1.96$ from Appendix Table 1,
i.e., the value such that 95% of the area under the standard normal curve
lies between $-Z_{.025}$ and $+Z_{.025}$. One is 95% confident that this interval
contains $\mu$.

(ii)  Variance, $\sigma^2$, unknown

When the variance $\sigma^2$ is unknown and is estimated by the sample
variance $s^2$, the $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by

$$\left( \bar{X} - t_{\alpha/2} \frac{s}{\sqrt{n}} \; , \; \bar{X} + t_{\alpha/2} \frac{s}{\sqrt{n}} \right),$$

where $t_{\alpha/2}$ is the t-value read from Appendix Table 2 for (n - 1) degrees
of freedom. For example, for a 95% confidence interval, the t-values are
read vertically under $t_{.025}$ for (n - 1) degrees of freedom.


2.  Confidence Interval for the Variance $\sigma^2$ of a Normal Distribution

To calculate a 95% confidence interval for $\sigma^2$, one computes $s^2$, the sample
variance based on a random sample of size n, and reads the values of $\chi^2_{.975}$ and
$\chi^2_{.025}$ from Appendix Table 3 for (n - 1) degrees of freedom. Thus, a 95%
confidence interval for $\sigma^2$ is given as

$$\left( \frac{(n - 1) s^2}{\chi^2_{.025}} \; , \; \frac{(n - 1) s^2}{\chi^2_{.975}} \right).$$

More generally, the $100(1 - \alpha)\%$ confidence interval for $\sigma^2$ is given by

$$\left( \frac{(n - 1) s^2}{\chi^2_{\alpha/2}} \; , \; \frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}} \right),$$

where $\chi^2_{\alpha/2}$ and $\chi^2_{1 - \alpha/2}$ are the $\chi^2$-values read from Appendix Table 3 for
(n - 1) degrees of freedom.

It follows that a 95% confidence interval for $\sigma$, the standard deviation of a
normal distribution, is expressed as

$$\left( \sqrt{\frac{(n - 1) s^2}{\chi^2_{.025}}} \; , \; \sqrt{\frac{(n - 1) s^2}{\chi^2_{.975}}} \right).$$
3.  Confidence Interval for a Proportion

If a discrete random variable has a binomial distribution, one may be concerned
with estimating the population proportion of defectives, p. Suppose that a
random sample of size n is drawn and X of the units are found defective. Then
X/n measures the proportion of defectives in the sample. If n is large and p is
not too close to 0 or 1, the central limit theorem allows the use of the normal
approximation to the binomial, giving that

$$\frac{X/n - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$

approximately has a standard normal distribution. The 95% confidence interval
for p then becomes

$$\left( \frac{X}{n} - 1.96 \sqrt{\frac{p(1 - p)}{n}} \; , \; \frac{X}{n} + 1.96 \sqrt{\frac{p(1 - p)}{n}} \right).$$

Unfortunately, this interval depends on p, which, of course, is unknown.
However, if we replace p by its point estimate X/n, we obtain the following
approximate 95% confidence interval for p:

$$\left( \frac{X}{n} - 1.96 \sqrt{\frac{(X/n)(1 - X/n)}{n}} \; , \; \frac{X}{n} + 1.96 \sqrt{\frac{(X/n)(1 - X/n)}{n}} \right).$$


Example 2.7     Confidence Intervals for the Mean and Variance

Calculate 95% confidence intervals for $\mu$, $\sigma^2$, and $\sigma$ for the data given in
Section 2.3 on percent moisture in skim milk powder and the ensuing calculations
performed in Example 2.2.

Here, $\bar{X}$ = 3.51, $s^2$ = 0.1911, s = 0.4372, and n = 50.

The 95% confidence interval for $\mu$ is evaluated as

$$\left(3.51 - \frac{(2.01)(0.4372)}{\sqrt{50}}\ ,\ \ 3.51 + \frac{(2.01)(0.4372)}{\sqrt{50}}\right)$$

or (3.39, 3.63), where the value of $t_{.025}$ is approximately equal to 2.01, for
49 degrees of freedom, from Appendix Table 2. We are highly confident, i.e.,
95% confident, that the true mean percent moisture content of the skim milk
powder lies between 3.39 and 3.63(%).

For the 95% confidence intervals for $\sigma^2$ and $\sigma$, we first read the required
$\chi^2$-values for 49 degrees of freedom from Appendix Table 3 as

$\chi^2_{.025}$ = 70.222 and $\chi^2_{.975}$ = 31.555.

Thus, a 95% confidence interval for $\sigma^2$ is obtained as

$$\left(\frac{(50-1)(0.1911)}{70.222}\ ,\ \ \frac{(50-1)(0.1911)}{31.555}\right)$$

or (0.1333, 0.2967).

The 95% confidence interval for $\sigma$ is, therefore,

$(\sqrt{0.1333}\ ,\ \sqrt{0.2967})$ = (0.3651, 0.5447).
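
The arithmetic of Example 2.7 can be checked with a short script. This is only a minimal sketch: the function names `mean_ci` and `variance_ci` are illustrative, and the critical values are simply the tabulated ones quoted in the example rather than values computed from the distributions.

```python
import math

def mean_ci(xbar, s, n, t_crit):
    """Two-sided interval for the mean: xbar +/- t * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def variance_ci(s2, n, chi2_upper, chi2_lower):
    """Two-sided interval for the variance.

    chi2_upper is the larger tabulated value (chi-square_.025 in the text),
    chi2_lower the smaller one (chi-square_.975).
    """
    ss = (n - 1) * s2
    return ss / chi2_upper, ss / chi2_lower

# Summary statistics and table values as quoted in Example 2.7.
mu_lo, mu_hi = mean_ci(3.51, 0.4372, 50, t_crit=2.01)
var_lo, var_hi = variance_ci(0.1911, 50, chi2_upper=70.222, chi2_lower=31.555)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"mean:     ({mu_lo:.2f}, {mu_hi:.2f})")
print(f"variance: ({var_lo:.4f}, {var_hi:.4f})")
print(f"std dev:  ({sd_lo:.4f}, {sd_hi:.4f})")
```

Because the script takes the square root of the unrounded variance limits, the last digit of the limits for the standard deviation may differ slightly from the hand calculation, which starts from the rounded values.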

Example  2.8     Confidence  Interval  for  a  Proportion 

From  a  large  lot  of  apples,  a  random  sample  of  100  is  inspected,  yielding  15  bad 
apples.  Calculate  a  95%  confidence  interval  for  p,  the  true  proportion  of  bad 
apples  in  the  entire  lot. 

Here, n = 100, X = 15, X/n = 0.15, and

$$\sqrt{\frac{(X/n)(1 - X/n)}{n}} = \sqrt{\frac{(0.15)(0.85)}{100}} = 0.0357.$$

An  approximate  95%  confidence  interval  for  p  is  then  computed  as 

(0.15  -  1.96  x  0.0357  ,  0.15  +  1.96  x  0.0357)  =  (0.08  ,  0.22). 

Hence,  we  are  95%  confident  that  the  actual  proportion  of  bad  apples  in  the  lot 
is  between  8%  and  22%. 
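
As a rough check of Example 2.8, the approximate interval can be computed directly from X and n. The helper name `proportion_ci` is hypothetical, and the default multiplier 1.96 is the 95% value used in the text.

```python
import math

def proportion_ci(x, n, z=1.96):
    """Approximate interval for a proportion using the normal approximation."""
    p_hat = x / n                               # sample proportion X/n
    se = math.sqrt(p_hat * (1 - p_hat) / n)     # estimated standard error
    return p_hat - z * se, p_hat + z * se

# 15 bad apples in a random sample of 100 (Example 2.8).
low, high = proportion_ci(15, 100)
print(f"({low:.2f}, {high:.2f})")
```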


2.12     Hypothesis  Testing 

Whereas  statistical  estimation  uses  sample  observations  to  form  point  or  interval 
estimates  of  unknown  parameters,  hypothesis  testing  is  used  to  test  the  validity 
of  certain  assumptions  made  on  these  parameters. 

Null  and  Alternative  Hypotheses 

The  main  hypothesis  that  we  test  is  called  the  null  hypothesis  and  is  denoted 
by  H0.  Any  other  complementary  hypotheses  are  called  alternative  hypotheses 
and  are  denoted  by  HA.  For  example,  in  controlling  the  fill  weight  of  canned 
peaches  on  a  production  line,  we  might  hypothesize  that  the  mean  fill  weight  is 
greater than 32 ounces. Thus, we have H0: $\mu = 32$ and HA: $\mu > 32$. If two
production lines are involved in this process, we might hypothesize that the
mean fill weight is the same for both lines. In this case, we have H0: $\mu_1 = \mu_2$
and HA: $\mu_1 \neq \mu_2$, where $\mu_1$ and $\mu_2$ refer to the true mean fill weights of
canned peaches produced on lines 1 and 2 respectively.

Type I and Type II Errors

In hypothesis testing, two types of errors can occur. A type I error is made
upon rejecting H0 when in fact it is true. The probability of committing a type I
error is called the level of significance of the test and is denoted by $\alpha$ (alpha).
A type II error is made upon accepting H0 when in fact it is false. The
probability of a type II error is denoted by the Greek letter $\beta$ (beta) and 1 - $\beta$ is
known as the power of the test. Since it is generally difficult to predict the
probability of committing a type II error, we develop our testing procedures to
accommodate and control the type I error.

A decision to accept or reject the null hypothesis is made by establishing
acceptance and critical (rejection) regions based on a confidence level of
1 - $\alpha$. The critical values act as boundaries to separate the acceptance region
from the critical region. For example, for $\alpha$ = 0.05, critical values for the
standard normal distribution are $-Z_{\alpha/2}$ = -1.96 and $Z_{\alpha/2}$ = 1.96 from
Appendix Table 1. If upon computation the test statistic Z falls in the critical
region, defined by values less than -1.96 or greater than 1.96, the null hypothesis
is rejected at the 5% level of significance. The null hypothesis is accepted if
Z falls in the acceptance region bounded by -1.96 and 1.96 (see Figure 2.5).

One and Two-Tailed Tests

The critical regions for a distribution can vary between tests conducted at the
same significance level, depending on the statement of the alternative hypothesis.
A test procedure for any statistical hypothesis where the alternative is one-sided,
such as

H0: $\mu = 4$, HA: $\mu > 4$     or     H0: $\mu = 4$, HA: $\mu < 4$,

is called a one-sided or one-tailed test. The critical region for the alternative
hypothesis HA: $\mu > 4$ lies entirely in the right tail of the distribution, while the
critical region for the alternative hypothesis HA: $\mu < 4$ lies entirely in the left
tail. In such cases when a symmetrical distribution is being used, if $\alpha$ = 0.05
or 5%, we require the acceptance region to comprise the whole area (50%) under
the curve on one side of the mean along with 45% from the mean on the other side
(see Figure 2.5).

A test method for any statistical hypothesis where the alternative is two-sided,
such as

H0: $\mu = 4$, HA: $\mu \neq 4$,

is called a two-sided or two-tailed test. The alternative hypothesis states that
either $\mu < 4$ or $\mu > 4$. Consequently, an equal area in both tails of the
distribution constitutes the critical region. Here, for a symmetrical distribution, if
$\alpha$ = 0.05 or 5%, we want half of the acceptance region, namely an area of
47.5%, on either side of the mean (see Figure 2.5).

[Figure: two panels. One-Sided Tests: critical region of probability $\alpha$ in a single tail. Two-Sided Test: central acceptance region with critical regions of probability $\alpha/2$ in each tail.]

Figure 2.5     Acceptance and Critical Regions for One and Two-Sided Tests for Normal
Distribution

2.12.1     Tests of Significance

1.  Testing a Mean Value $\mu_0$, $\sigma^2$ Known: Z-test

Null Hypothesis    H0: $\mu = \mu_0$

Test Statistic     $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Alternative        Reject H0 at the 0.05
Hypothesis         Level of Significance if

$\mu \neq \mu_0$         Z > 1.96 or Z < -1.96
$\mu > \mu_0$            Z > 1.645
$\mu < \mu_0$            Z < -1.645

2.  Testing a Mean Value $\mu_0$, $\sigma^2$ Unknown: t-test

Null Hypothesis    H0: $\mu = \mu_0$

Test Statistic     $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$ ,  with f = (n - 1) d.f.

Alternative        Reject H0 at the 0.05
Hypothesis         Level of Significance if

$\mu \neq \mu_0$         t > $t_{.025}$ or t < -$t_{.025}$
$\mu > \mu_0$            t > $t_{.05}$
$\mu < \mu_0$            t < -$t_{.05}$

3.  Testing Differences between Two Means: Variances Unknown

Null Hypothesis    H0: $\mu_1 = \mu_2$ or H0: $\mu_1 - \mu_2 = 0$

Test Statistic     $t = \dfrac{\bar{X}_1 - \bar{X}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$ ,  with f = ($n_1$ + $n_2$ - 2) d.f.,

where $s_p^2 = \dfrac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$

Alternative        Reject H0 at the 0.05
Hypothesis         Level of Significance if

$\mu_1 \neq \mu_2$         t > $t_{.025}$ or t < -$t_{.025}$
$\mu_1 > \mu_2$            t > $t_{.05}$
$\mu_1 < \mu_2$            t < -$t_{.05}$

4.  Testing a Proportion Value $p_0$

Null Hypothesis    H0: $p = p_0$

Test Statistic     $Z = \dfrac{X/n - p_0}{\sqrt{\dfrac{p_0\,(1 - p_0)}{n}}}$

where X is the number of occurrences for the attribute of interest in the
n trials.

Alternative        Reject H0 at the 0.05
Hypothesis         Level of Significance if

$p \neq p_0$           Z > 1.96 or Z < -1.96
$p > p_0$              Z > 1.645
$p < p_0$              Z < -1.645


Example  2.9    Testing  a  Mean  Value 

It  is  hypothesized  that  the  mean  weight  of  chicken  eggs  produced  by  hens  fed  a 
particular  diet  is  higher  than  55  grams.  To  test  this  assumption,  a  random 
sample of 25 eggs is taken and each egg is weighed, yielding $\bar{X}$ = 56.0 g and
s = 6.0 g. Using a one-sided test at a significance level of 5%, test the hypothesis
H0: $\mu = 55$ against HA: $\mu > 55$.

The appropriate test statistic is given as

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{56.0 - 55}{6.0/\sqrt{25}} = 0.83.$$

From Appendix Table 2, the value of $t_{.05}$ for f = n - 1 = 24 degrees of
freedom is given as 1.71. Since the calculated t-value is less than the tabulated
t-value, we cannot reject the null hypothesis at the 5% significance level.
Therefore, we have no significant evidence to conclude that this specific diet
produces a mean egg weight greater than 55 grams.
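
The decision in Example 2.9 can be reproduced in a few lines. This is a sketch only: `one_sample_t` is an illustrative name, and the critical value 1.71 is the tabulated $t_{.05}$ for 24 degrees of freedom quoted in the example.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for testing H0: mu = mu0 when sigma is unknown."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Egg-weight data from Example 2.9.
t_stat = one_sample_t(xbar=56.0, mu0=55.0, s=6.0, n=25)
t_crit = 1.71                      # tabulated t_.05, 24 d.f. (from the text)

# One-sided test HA: mu > 55, so reject only for large positive t.
reject_h0 = t_stat > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```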


Example  2.10     Testing  the  Difference  Between  Two  Means 

To  compare  the  mean  effect  of  two  different  treatments  on  corn  yield,  25  plots 
of  corn  are  given  treatment  1  and  20  other  similar  plots  receive  treatment  2.  The 
corresponding  sample  means  and  variances  for  the  number  of  bushels  of  corn 
harvested  per  plot  are  obtained  as  follows: 

$\bar{X}_1$ = 83, $\bar{X}_2$ = 64, $s_1^2$ = 7.3, $s_2^2$ = 9.8.

At the .05 level of significance, test the hypothesis H0: $\mu_1 = \mu_2$ against
HA: $\mu_1 \neq \mu_2$. Here, the proper test statistic is given by

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{83 - 64}{\sqrt{8.4}\,\sqrt{\dfrac{1}{25} + \dfrac{1}{20}}} = 21.9,$$

where

$$s_p^2 = \frac{(25 - 1)(7.3) + (20 - 1)(9.8)}{25 + 20 - 2} = 8.4.$$

The tabulated $t_{.025}$ value for f = $n_1$ + $n_2$ - 2 = 43 degrees of freedom is
approximately 2.02 from Appendix Table 2. Since the calculated t-value of 21.9
is greater than the tabulated t-value of 2.02, we reject the null hypothesis and
conclude that, at the 5% level of significance, the two treatments exhibit a
statistically significant difference in mean corn yield.
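
A small script confirms the pooled variance and the t statistic of Example 2.10. The function names are hypothetical, and the critical value 2.02 is the tabulated figure quoted in the example, not a computed quantile.

```python
import math

def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Pooled estimate of a common variance from two samples."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def two_sample_t(xbar1, xbar2, n1, s1_sq, n2, s2_sq):
    """t statistic for H0: mu1 = mu2 using the pooled variance."""
    sp = math.sqrt(pooled_variance(n1, s1_sq, n2, s2_sq))
    return (xbar1 - xbar2) / (sp * math.sqrt(1 / n1 + 1 / n2))

# Corn-yield data from Example 2.10.
sp_sq = pooled_variance(25, 7.3, 20, 9.8)
t_stat = two_sample_t(83, 64, 25, 7.3, 20, 9.8)
t_crit = 2.02                      # tabulated t_.025, 43 d.f. (from the text)

# Two-sided test: reject when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit
print(f"sp^2 = {sp_sq:.2f}, t = {t_stat:.1f}, reject H0: {reject_h0}")
```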


Example  2.11     Testing  a  Proportion  Value 

Using the random sample of 100 apples inspected in Example 2.8, investigate
the hypothesis, at the 5% level of significance, that the proportion of bad apples
in the corresponding lot is at most 12%.

Here, we need to test the hypothesis H0: p = 0.12 against HA: p > 0.12.
The applicable test statistic is calculated as

$$Z = \frac{X/n - p_0}{\sqrt{\dfrac{p_0\,(1 - p_0)}{n}}} = \frac{0.15 - 0.12}{\sqrt{\dfrac{(0.12)(0.88)}{100}}} = 0.923.$$

Since this calculated Z-value of 0.923 is less than 1.645, we find no evidence
from the data to reject the null hypothesis and hence accept at the 5% significance
level that at most 12% of the apples contained in the lot are bad.

Furthermore,  upon  referring  to  Example  2.8,  we  note  that  p0  =  0.12  or  12% 
falls  within  the  95%  confidence  interval  evaluated  for  p,  which  confirms  our 
above  results. 
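
The test statistic of Example 2.11 can be verified as follows. The name `proportion_z` is illustrative; 1.645 is the one-sided 5% critical value given in the table of tests in Section 2.12.1.

```python
import math

def proportion_z(x, n, p0):
    """Z statistic for testing H0: p = p0 with the normal approximation."""
    return (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)

# 15 bad apples out of 100, hypothesized proportion 0.12 (Example 2.11).
z_stat = proportion_z(15, 100, 0.12)

# One-sided test HA: p > 0.12.
reject_h0 = z_stat > 1.645
print(f"Z = {z_stat:.3f}, reject H0: {reject_h0}")
```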


2.13     Linear  Regression  and  Correlation 


2.13.1     Linear  Regression 

Correlation analysis measures the degree or strength of association between two
quantitative variables while regression analysis further identifies the nature of
that relationship for prediction purposes. Predicting the behavior of two variables,
exhibiting a linear relationship, is achieved through a straight line regression
equation Y = A + BX, where Y is the unknown dependent variable and
X is the known independent variable. The constant A, called the Y-intercept, is
the Y-value at which the line intersects the Y-axis. The constant B, called the
regression coefficient, is the slope of the line and represents the change in Y
caused by a unit change in the value of X.

For any given sample of n observations $Y_i$ corresponding to n selected values
$X_i$, the predicted value of Y for any fixed value of X, denoted by $\hat{Y}$ (read
Y-hat), is obtained from the best-fitting line to this data, derived by the
method of least squares, which yields the (estimated) regression equation as

$$\hat{Y} = a + bX,$$

where

$$b = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}} \quad\text{and}\quad a = \frac{\sum Y}{n} - b\left(\frac{\sum X}{n}\right).$$

Thus, for any specified value of X, say X = $X_0$, the corresponding predicted
value of Y is given as $\hat{Y} = a + bX_0$. For an interval estimate of Y, a 95%
prediction interval can be computed as

$$\hat{Y} \pm t_{.025}\,(s_{\hat{Y}}),$$

where $t_{.025}$ is the tabulated t-value for (n - 2) degrees of freedom in Appendix
Table 2 and $s_{\hat{Y}}$ is the standard error (S.E.) of an individual predicted value for
Y, with

$$s_{\hat{Y}}^2 = s_{y \cdot x}^2\left(1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum X^2 - \dfrac{(\sum X)^2}{n}}\right),$$

where $s_{y \cdot x}$ is the standard error of estimate for a linear regression of Y on X and

$$s_{y \cdot x}^2 = \frac{\sum (Y - \hat{Y})^2}{n - 2} = \frac{\sum Y^2 - a(\sum Y) - b(\sum XY)}{n - 2}.$$

The 95% prediction interval for Y can, therefore, also be written as

$$\hat{Y} \pm t_{.025}\sqrt{s_{y \cdot x}^2\left(1 + \frac{1}{n} + \frac{(X_0 - \bar{X})^2}{\sum X^2 - \dfrac{(\sum X)^2}{n}}\right)}.$$


2.13.2     Correlation 

The linear association between two variables is measured by a correlation
coefficient that is calculated, for a sample of n observations, as

$$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}.$$

The value of r may fall anywhere within the range -1 to +1. Negative values
of r indicate that an inverse relationship exists between X and Y whereas
positive values of r are obtained when there is a direct relationship between X
and Y. If r = 0, no linear relationship exists between the two variables.

It is also meaningful to know the degree or strength of the linear relationship
between X and Y, based on the magnitude of the correlation coefficient. This is
done by calculating $r^2$, called the coefficient of determination, which measures
the proportion of the total variation in Y which is due to the linear
association between X and Y.

To test the significance of r, the population correlation coefficient, denoted by
$\rho$ (rho), is set equal to zero, i.e., H0: $\rho = 0$. The test statistic is then evaluated
as

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}},$$

which has a t-distribution with (n - 2) degrees of freedom. The decision to
accept or reject H0 is made as before by comparing the calculated and the
tabulated t-values at a given level of significance.


Example  2.12    Regression  and  Correlation  Analysis 

In  a  study  on  the  effect  of  the  percentage  of  growth  hormone  in  the  feed  (X)  upon 
the  weight  at  20  weeks  of  turkeys  in  kg  (Y),  a  random  sample  of  birds  was 
obtained  and  the  results  recorded  as  shown  in  Table  2.3.  Perform  a  regression  and 
correlation  analysis  of  this  data. 


TABLE 2.3:     Regression Data for Turkey Study

Hormone Percentage (X)     4      6      8      10     12     14     16     18     20
Weight (kg) (Y)            4.1    4.9    4.6    5.3    5.8    5.6    6.4    6.4    6.7

Here, n = 9, $\sum X$ = 108, $\sum Y$ = 49.8, $\bar{X}$ = 12, $\bar{Y}$ = 5.53, $\sum X^2$ = 1536,
$\sum Y^2$ = 281.88, $\sum XY$ = 635.2.

Hence,

$$b = \frac{635.2 - \dfrac{(108)(49.8)}{9}}{1536 - \dfrac{(108)^2}{9}} = 0.157 \quad\text{and}\quad a = \frac{49.8}{9} - \frac{(0.157)(108)}{9} = 3.649.$$

Thus, the regression equation is expressed as

$$\hat{Y} = 3.649 + 0.157X$$

To draw the estimated regression line, find $\hat{Y}$ for two selected values of X (say,
2 and 20). When X = 2, $\hat{Y}$ = 3.963. When X = 20, $\hat{Y}$ = 6.789. Join these
two points with a straight edge to obtain the desired line (see Figure 2.6).

[Figure: fitted line $\hat{Y}$ = 3.649 + 0.157X plotted against Growth Hormone in Feed (%), with X running from 2 to 20.]

Figure 2.6     Regression Line for Data in Table 2.3


Let us construct a 95% prediction interval for Y when X = 15(%). When
X = $X_0$ = 15, $\hat{Y}$ = 6.004,

$$s_{y \cdot x}^2 = \frac{281.88 - (3.649)(49.8) - (0.157)(635.2)}{9 - 2} = 0.0619, \text{ and}$$

$$s_{\hat{Y}}^2 = 0.0619\left(1 + \frac{1}{9} + \frac{(15 - 12)^2}{1536 - \dfrac{(108)^2}{9}}\right) = 0.0711.$$

Thus, a 95% prediction interval for Y, when X = $X_0$ = 15, is given by
6.004 ± 2.3646 $\sqrt{0.0711}$ or (5.373, 6.635) kg, where 2.3646 is the tabulated
$t_{.025}$ value from Appendix Table 2 for f = n - 2 = 7 degrees of freedom.


The correlation coefficient is given as

$$r = \frac{635.2 - \dfrac{(108)(49.8)}{9}}{\sqrt{\left(1536 - \dfrac{(108)^2}{9}\right)\left(281.88 - \dfrac{(49.8)^2}{9}\right)}} = 0.9654.$$

To test the significance of this correlation coefficient, i.e., H0: $\rho = 0$ against
HA: $\rho > 0$, the pertinent test statistic is calculated as

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{0.9654}{\sqrt{\dfrac{1 - (0.9654)^2}{9 - 2}}} = 9.795.$$

From Appendix Table 2, the $t_{.05}$ value for n - 2 = 9 - 2 = 7 degrees of freedom
is 1.8946. Since the calculated t-value of 9.795 is much greater than the tabulated
t-value of 1.8946, we reject H0 and conclude at the 5% level of significance that a
highly significant positive correlation exists between the hormone percentage and
the weight.

We, therefore, conclude from the above analysis that a direct relationship exists
between X and Y, i.e., an increase in the percentage of growth hormone given in the
feed produces an increase in weight in 20-week old turkeys. The same conclusion
can be drawn from the coefficient of determination $r^2$ = $(0.9654)^2$ = 0.9320,
which reveals that 93.2% of the variation in the weight of 20-week old turkeys is
explained by the linear association existing between the weight and the percentage
of growth hormone present in the feed.
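
The whole of Example 2.12 can be recomputed from the raw data in Table 2.3. The sketch below follows the formulas of Sections 2.13.1 and 2.13.2; the variable and function names are illustrative, and the t-value 2.3646 is the tabulated figure quoted in the example. Because the script carries full precision instead of rounding b to 0.157 first, its results agree with the hand calculation only to about two or three significant figures.

```python
import math

# Data from Table 2.3: hormone percentage (X) and turkey weight in kg (Y).
x = [4, 6, 8, 10, 12, 14, 16, 18, 20]
y = [4.1, 4.9, 4.6, 5.3, 5.8, 5.6, 6.4, 6.4, 6.7]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Corrected sums of squares and cross-products.
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

b = sxy / sxx                          # regression coefficient (slope)
a = sum_y / n - b * sum_x / n          # Y-intercept
r = sxy / math.sqrt(sxx * syy)         # correlation coefficient

# Squared standard error of estimate, using the computational form in the text.
s2_yx = (sum_y2 - a * sum_y - b * sum_xy) / (n - 2)

def prediction_interval(x0, t_crit):
    """Prediction interval for an individual Y at X = x0."""
    y_hat = a + b * x0
    s2_pred = s2_yx * (1 + 1 / n + (x0 - sum_x / n) ** 2 / sxx)
    half_width = t_crit * math.sqrt(s2_pred)
    return y_hat - half_width, y_hat + half_width

t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))
low, high = prediction_interval(15, t_crit=2.3646)

print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.4f}, t = {t_stat:.2f}")
print(f"prediction interval at X = 15: ({low:.2f}, {high:.2f})")
```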
CHAPTER 3
SAMPLING  METHODS 


3.1     Introduction 

In  any  investigation,  whether  it  be  a  laboratory  experiment,  sampling  inspection 
of  food  products,  or  a  survey,  the  purpose  is  to  estimate  or  compare  lot  or 
population  characteristics  or  to  make  decisions  about  lot  acceptance/rejection. 
Ideally, of course, we would like to be able to make use of the entire population
to study the characteristics of interest. However, for most enquiries, complete
enumeration of the population or 100% inspection is either impossible, impractical,
time-consuming, or uneconomical, creating fatigue and boredom for the
inspection personnel. Consequently, generalizations and inferences about populations
are derived by observing the behavior and properties of a relatively small
number of the population units, called a sample.

The purpose of sampling theory is to make sampling more efficient, i.e., to
develop methods of sample selection and of estimation that provide optimum
information at the lowest possible cost. The ensuing estimates are expected to
be sufficiently accurate and precise. Accuracy refers to the closeness of a
measured value to its true value and is measured by the positive difference
between the expected and the true value. The precision of an estimate is measured
by the amount of variability among repeated measurements and is expressed
by $\sigma^2$. Thus, an estimate will be accurate and unbiased if the true value
and the expected value are identical and will be precise if $\sigma^2$ is small.


3.2     Types  of  Sampling  Enquiries 

The  nature  and  purpose  of  an  enquiry  determines  the  type  of  sampling  process 
to  be  used.  Generally,  there  are  four  types  of  enquiries: 

•  Sampling priority: establishing a priority of sampling inspection, for a
   multitude of products involving divergent risks or critical characteristics,
   so that resources are allocated optimally.

•  Sampling frequency: determining the level of inspection needed for each
   product or activity.

•  Sample size: determining the amount of product to be selected for inspection
   for each activity or lot.

•  Sample selection method: determining the most feasible statistical
   method for selecting the designated sample size.
3.2.1     Sampling Priority

Sampling  priority  is  a  function  of  several  factors,  such  as: 

•  the  relative  importance  of  products  in  the  domestic,  import,  and  export 
markets 

•  the  product  characteristics  and  their  impact 

•  the  production  schedule,  volume,  and  quality/compliance  history  of  a 
product 

•  the  inspection  costs 

•  the  physical  location  of  a  product 

•  the  availability  of  inspection  resources 

•  the  impact  on  trade 

•  the  consumer  needs  and  expectations 

Sampling  priority  can  be  established  either  by  developing  a  multivariate  model 
which  takes  into  consideration  the  essential  factors  or  by  utilizing  management 
techniques  to  optimally  allocate  the  available  financial  and  human  resources. 


3.2.2     Sampling  Frequency 

Like  sampling  priority,  the  determination  of  sampling  frequency  also  depends 
on  many  factors,  including: 

•  the  production  volume,  frequency  and  schedule 

•  the  quality/compliance  history  of  the  product 

•  the  quality  management  system  of  the  organization 

•  the  distance  an  inspector  has  to  travel  to  carry  out  an  inspection 

•  the  physical  layout  of  the  product  and  facility 

•  the  required  inspection  resources  and  costs 

•  the  regulatory  requirements  and  consumer  needs 

Generally,  one  can  start  with  a  normal  frequency  of  inspection,  commensurate 
with  the  available  resources,  and  then  switch  to  either  a  tightened  or  a  reduced 
mode  depending  on  whether  the  product  quality  deteriorates  or  improves. 

Sometimes,  a  probabilistic  approach,  such  as  the  following,  can  effectively  be 
used  to  establish  sampling  frequency: 

Sampling frequency = f = log $p_2$ / log $p_1$

where

$p_1$ = the probability of failing to detect the presence of 'trouble' with
the first sample selected after 'trouble' has entered the process,

$p_2$ = the probability of failing to detect the presence of such 'trouble'
for f consecutive samples.

For example, if $p_1$ = 0.83 and $p_2$ = 0.05, then f = log 0.05 / log 0.83, which
is approximately 16. This means that, even though the probability of failing to
detect the presence of 'trouble' in the first sample is as high as 0.83, the
probability that the presence of this 'trouble' will remain undetected while 16
consecutive samples are selected is approximately 0.05.
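
The formula for f follows from requiring $p_1^f = p_2$, which assumes that each sample fails to detect the 'trouble' independently of the others. The sketch below, with the hypothetical helper `sampling_frequency`, evaluates the formula for the figures quoted above and also shows the probability actually reached after 16 samples.

```python
import math

def sampling_frequency(p1, p2):
    """Number of consecutive samples f satisfying p1 ** f = p2."""
    return math.log(p2) / math.log(p1)

f = sampling_frequency(p1=0.83, p2=0.05)
f_rounded = round(f)

# Probability that 'trouble' is still undetected after f_rounded samples,
# assuming independent samples (an assumption of this sketch).
undetected = 0.83 ** f_rounded

print(f"f = {f:.2f}, rounded to {f_rounded}")
print(f"P(undetected after {f_rounded} samples) = {undetected:.4f}")
```

Since f is slightly above 16, the probability after exactly 16 samples sits marginally above 0.05; taking 17 samples would bring it below that level.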


3.2.3  Sample  Size 

Determining the sample size is the most frequently raised question in an
investigation. Sample sizes are basically required for two purposes: to estimate lot or
population parameters and to make a decision on lot acceptance. The latter,
known as acceptance sampling, is discussed in Chapter 4.

A  basic  difficulty  in  solving  most  types  of  "sample  size"  problems  is  that  often 
we  don't  know  what  we  want  and  we  lack  certain  information  necessary  for 
calculations.  To  solve  the  problem  of  determining  sample  size,  three  questions 
must  first  be  answered: 

1.  What  variation  is  expected  in  the  experiment? 

2.  What  difference  between  the  estimated  and  true  values  or  what  difference 
between  the  treatments  is  expected? 

3.  What  accuracy  of  estimation  is  desired? 

Many  short-cut  methods  have  been  used  for  determining  sample  size  such  as 
extracting  the  square  root  of  the  number  of  units  in  a  lot  or  taking  a  fixed 
percentage  of  the  units,  say  10%,  from  a  lot.  While  these  techniques  are  easy  to 
use,  they  are  not  based  upon  a  statistical  consideration  of  the  experiment.  A 
basic  formula  for  determining  sample  size  for  estimation  purposes,  assuming 
normality,  will  be  discussed  in  Section  3.3. 

3.2.4  Sample  Selection  Method 

Once  the  sample  size  has  been  designated,  the  next  question  is  how  to  select  a 
random  and  representative  sample.  Basically,  one  can  either  use  a  judgement  or 
haphazard  method  or  choose  from  the  available  probabilistic  sample  selection 
methods  discussed  in  Section  3.4. 
3.3     Sample Size Determination: A Formula

When the characteristic under measurement is approximately normally distributed,
a simple formula for determining sample size is given as follows:

$$n = \frac{N}{1 + \dfrac{e^2\,(N - 1)}{Z^2\,p\,(1 - p)}}$$

where 


n  =     sample  size 

N  =     lot  size 

Z  =     standard    normal    value    for    desired    confidence    level 
(e.g.,  Z  =  1.96  for  95%  confidence) 

e  =     error  that  the  investigator  is  willing  to  tolerate  between  the 
estimated  and  the  true  value 

p  =     estimated  proportion  of  defectives. 

Note  that  the  value  of  e  is  arbitrarily  chosen  by  the  investigator;  the  smaller  the 
e  selected,  the  larger  the  sample  size  will  be.  The  value  of  p  is  obtained  from 
the  overall  process  average. 

A corresponding formula, when the value of the standard deviation $\sigma$ is available,
is given as

$$n = \frac{N}{1 + \dfrac{e^2\,(N - 1)}{Z^2 \sigma^2}}.$$

In cases where N is very large or the sampling fraction n/N is close to zero, the
above formulas reduce to

$$n = \frac{Z^2\,p\,(1 - p)}{e^2} = \frac{Z^2 \sigma^2}{e^2}.$$

An appreciation of the joint effect of the parameters e and p in the formula for
sample size determination can be realized from Table 3.1. The table gives the
respective value of n for selected values of N, p, and e at a 95% confidence
level, i.e., when Z = 1.96.

TABLE 3.1:  Effect of Parameters on Sample Size
(1 - $\alpha$ = 0.95)

                        e = .01                          e = .10
      N        p = .01   p = .10   p = .20      p = .01   p = .10   p = .20

     500         217       437       463           4        33        55
   1,000         276       776       861           4        34        58
   2,000         320      1268      1510           4        35        60
   5,000         354      2045      2758           4        35        61
  10,000         367      2570      3807           4        35        62
  50,000         378      3234      5474           4        35        62

Example  3.1     Sample  Size  Determination 


If  4%  of  a  manufactured  product  is  found  defective  on  a  long-term  average, 
what  minimum  sample  size  is  required  from  a  lot  of  400  units  of  this  product  so 
that  there  will  be  95%  confidence  that  the  error  in  the  mean  estimate  will  not 
exceed  3%? 

We have p = 0.04, 1 - p = 0.96, N = 400, e = 0.03, and Z = 1.96. Then,

$$n = \frac{N}{1 + \dfrac{e^2\,(N - 1)}{Z^2\,p\,(1 - p)}} = \frac{400}{1 + \dfrac{(0.03)^2\,(399)}{(1.96)^2\,(0.04)\,(0.96)}} = 116.47.$$

Therefore,  under  the  conditions  specified,  a  sample  size  of  at  least  117  items  is 
required  from  a  lot  containing  400  units  of  product. 
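
The sample size formula is easy to script. In this sketch the function name `sample_size` is an assumption, and the result is rounded up to the next whole unit, which is the convention used in Example 3.1 (116.47 becomes 117) and which reproduces the entries of Table 3.1 that were checked.

```python
import math

def sample_size(lot_size, e, p, z=1.96):
    """Sample size for estimating a proportion from a finite lot.

    lot_size: N, e: tolerated error, p: estimated proportion defective,
    z: standard normal value for the desired confidence level.
    """
    n = lot_size / (1 + e ** 2 * (lot_size - 1) / (z ** 2 * p * (1 - p)))
    return math.ceil(n)   # round up so the stated error is not exceeded

# Example 3.1: N = 400, e = 0.03, p = 0.04 at 95% confidence.
n_example = sample_size(400, 0.03, 0.04)
print(n_example)

# Two entries of Table 3.1 for comparison.
print(sample_size(500, 0.01, 0.01), sample_size(50_000, 0.01, 0.10))
```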


3.4     Sample  Selection  Methods 

Once the sample size has been determined, its physical selection from the lot
can be done either by non-random sample selection methods or by probability
sampling procedures. Some of the non-probability sampling methods are: judgement
sampling, haphazard sampling, convenience sampling, grab sampling,
chunk sampling, quota sampling, etc.

Although there may arise situations where only non-probability sampling
methods are feasible, attempts should be made to ensure the randomness and
representativeness of a sample. Some of the probability sampling methods include:
simple random sampling, systematic sampling, stratified sampling, cluster
sampling, etc.
3.4.1     Simple Random Sampling

A  simple  random  sample  is  selected  from  a  lot  or  population  through  a  random 
process  where  all  the  elements  in  the  lot  have  an  equal  and  independent  chance 
of  being  included  in  the  sample.  Simple  random  samples  can  be  drawn  by  using 
tables  of  random  numbers.  Numerous  random  number  tables  are  available,  one 
of  which  is  provided  in  Appendix  Table  4  and  has  been  abstracted  from  the 
Rand  Corporation's  "A  Million  Random  Digits  with  100,000  Normal  Deviates" 
(Free  Press  of  Glenco,  New  York,  1955).  To  explain  the  use  of  random  numbers, 
the  first  150  random  numbers  in  Appendix  Table  4  are  reproduced  in  Table  3.2. 
For  larger  populations  and  the  repeated  use  of  random  numbers,  consult  the 
more  extensive  Rand  Corporation  table. 

Drawing  a  Simple  Random  Sample 

Suppose  a  simple  random  sample  of  eight  boxes  is  to  be  drawn  from  a  lot  of 
90  boxes.  The  boxes  in  the  lot  are  numerically  labelled  from  1  to  90. 

TABLE 3.2:     Random Numbers

93108  77033  68325  10160  38667  62441  87023  94372  06164  30700
28271  08589  83279  48838  60935  70541  53814  95588  05832  80235
21841  35545  11148  34775  17308  88034  97765  35959  52843  44895

Reading successive two-digit numbers along the first row of Table 3.2, starting
from the top of the first column, identifies the following boxes to be drawn for
this sample:

93   10   87   70   33   68   32   51

Note that the layout of numbers in groups of five within the table is simply for
reading convenience. The number 93 is ignored since no corresponding box can
be found in the lot. Proceeding along the first row, we then select the next
random number to replace 93, namely 01. Similarly, if a number were to be
repeated, the next random number would be selected to take its place. Thus, the
final eight boxes chosen to make up the required sample are numbered:

10   87   70   33   68   32   51   1
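The table-reading procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name draw_from_digit_table and its parameters are hypothetical, and the digit string is simply the first four five-digit groups of Table 3.2 with the spacing removed.

```python
def draw_from_digit_table(digits, lot_size, sample_size, width=2):
    """Read successive `width`-digit numbers from a string of random digits.

    Numbers with no matching unit in the lot (0 or greater than lot_size)
    and numbers already drawn are skipped, as described in the text.
    """
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        number = int(digits[start:start + width])
        if 1 <= number <= lot_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("ran out of random digits before the sample was complete")


# First four groups of Table 3.2 (93108 77033 68325 10160), spacing removed.
first_row = "93108" "77033" "68325" "10160"
sample = draw_from_digit_table(first_row, lot_size=90, sample_size=8)
print(sample)
```

When a computer is available, a pseudo-random generator such as random.sample(range(1, 91), 8) from the Python standard library can stand in for the printed table; that substitution is a convenience assumed here, not part of the procedure given in the text.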


3.4.2     Stratified  Random  Sampling 

A  stratified  random  sample  is  one  obtained  by  separating  the  population  units 
into  some  non-overlapping  groups,  called  strata,  and  then  selecting  a  simple 
random sample from each stratum. There are three main reasons why stratified
random sampling often results in increased information for a given cost:

1. The data is more homogeneous within each stratum than in the population as
a whole.

2. The cost of conducting the actual sampling tends to be lower for stratified
random sampling than for simple random sampling because of administrative
convenience.

3. When stratified sampling is used, separate estimates of the population
parameters can be obtained for each stratum without additional sampling.

Reduced variability within each stratum produces stratified sampling estimators
which have smaller variances than do the corresponding simple random
sampling estimators for the same sample size.

For  example,  if  in  a  shell  egg  packing  station  the  boxes  of  eggs  are  placed  on 
pallets  according  to  their  grade  size,  the  population  is  naturally  divided  into 
strata  (i.e.,  small,  medium,  large  and  extra  large)  and  the  sampling  inspection 
may  be  carried  out  by  using  stratified  random  sampling. 
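As a rough sketch of the egg-grading example, the following Python draws a simple random sample from each grade stratum. The box labels, the stratum sizes and the number of boxes taken per stratum are all hypothetical; in particular, taking about 10% of each stratum is an assumption made for illustration, since the text does not prescribe how the sample is allocated among strata.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] units from each stratum."""
    rng = random.Random(seed)
    return {name: sorted(rng.sample(units, sizes[name]))
            for name, units in strata.items()}


# Hypothetical lot of 200 boxes, labelled 1 to 200 and grouped by grade size.
strata = {
    "small": list(range(1, 41)),
    "medium": list(range(41, 121)),
    "large": list(range(121, 181)),
    "extra large": list(range(181, 201)),
}
# Hypothetical allocation: roughly 10% of the boxes in each stratum.
sizes = {"small": 4, "medium": 8, "large": 6, "extra large": 2}

sample = stratified_sample(strata, sizes, seed=1)
for grade, boxes in sample.items():
    print(grade, boxes)
```

Because each stratum is sampled separately, the boxes listed under each grade can be used to form a separate estimate for that grade, which is the third advantage noted above.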

3.4.3  Systematic  Sampling 

A sample obtained by randomly selecting one item from the first k population
units and every kth unit thereafter is called a one-in-k systematic random
sample. Consider N population units numbered serially from 1 to N from which
a sample of size n is to be drawn. We find an integer k, called the sampling
interval, evaluated as the integer closest to N/n, i.e., k ≈ N/n, and then randomly
select a number c between 1 and k inclusively. The required systematic random
sample then comprises the units numbered

c, c + k, c + 2k, ..., c + (n - 1)k.

Systematic  sampling  provides  a  useful  alternative  to  simple  random  sampling  in 
the  sense  that  it  is  easier  to  perform,  less  subject  to  error,  and  provides  greater 
information  per  unit  cost. 

For  example,  a  farmer  producing  maple  syrup  can  use  a  one-in-ten  systematic 
sample  to  determine  the  quality  of  sap  contained  in  his  maple  trees,  where  the 
total  number  of  trees  on  his  farm,  N,  is  unknown  and  he  therefore  cannot 
conduct  a  simple  random  sample. 
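For the case where N is known, the selection rule can be sketched as follows. The function name systematic_sample is hypothetical. Two details are assumptions made here rather than statements from the text: halves are rounded upward when finding the integer closest to N/n, and any unit number that would exceed N (which can happen when N/n is rounded up) is simply dropped.

```python
import random


def systematic_sample(N, n, seed=None):
    """Return (k, c, units) for a one-in-k systematic sample of about n units."""
    rng = random.Random(seed)
    k = max(1, int(N / n + 0.5))   # integer closest to N/n (halves rounded up)
    c = rng.randint(1, k)          # random start, 1 <= c <= k
    units = [c + i * k for i in range(n)]
    # Assumption: unit numbers beyond N are dropped rather than wrapped around.
    return k, c, [u for u in units if u <= N]


k, c, units = systematic_sample(N=90, n=8, seed=3)
print("interval k =", k, "start c =", c)
print(units)
```

For a lot of 90 boxes and a sample of eight, N/n = 11.25, so k = 11 and the sample consists of the randomly chosen start c followed by every eleventh box.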

3.4.4  Cluster  Sampling 

A  cluster  sample  is  a  simple  random  sample  in  which  each  sampling  unit  is  a 
collection,  or  cluster,  of  elements.  The  population  is  divided  into  clusters, 
designed  to  be  as  similar  as  possible  to  one  another.  The  heterogeneity  in  the 
population  is  reflected  within  each  cluster. 

Cluster random sampling is less costly than simple or stratified random sampling
if the cost of obtaining a frame listing all the population units is very high
or if the cost of obtaining observations increases as the distance separating the
units increases.

The first task in cluster sampling is to specify appropriate clusters. Elements
within a cluster are often physically close together and hence tend to have
similar characteristics. Thus, the amount of information pertinent to a population
parameter may not be substantially increased as new measurements are
taken within a cluster. In general, the number of elements within a cluster will
be small relative to the population size and the number of clusters in the sample
will be reasonably large.

To  illustrate,  suppose  a  Turkey  Marketing  Board  wishes  to  estimate  the  annual 
volume  of  turkey  purchased  per  household  in  a  widespread  thinly  populated 
county.  Travel  costs  from  household  to  household  are  substantial.  Therefore,  the 
15,000  households  in  the  county  are  listed  in  600  like  geographical  clusters  of 
25  households  each  and  a  simple  random  sample  of  30  clusters  is  selected. 
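The turkey-purchase survey can be sketched in Python as below. The numbers (600 clusters of 25 households, 30 clusters drawn) come from the example; the function name and the assumption that households are numbered consecutively within clusters are illustrative only.

```python
import random


def cluster_sample(num_clusters, cluster_size, clusters_to_draw, seed=None):
    """Draw a simple random sample of clusters and list every household in them."""
    rng = random.Random(seed)
    drawn = sorted(rng.sample(range(1, num_clusters + 1), clusters_to_draw))
    # Assumption: households are numbered consecutively, cluster by cluster,
    # so cluster 1 holds households 1-25, cluster 2 holds 26-50, and so on.
    households = [(cluster - 1) * cluster_size + h
                  for cluster in drawn
                  for h in range(1, cluster_size + 1)]
    return drawn, households


drawn, households = cluster_sample(num_clusters=600, cluster_size=25,
                                   clusters_to_draw=30, seed=5)
print(len(drawn), "clusters,", len(households), "households to visit")
```

Only the 600 clusters need to be listed in the sampling frame; every household in a selected cluster is then surveyed, which keeps the interviewer's travel within 30 compact areas.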


3.5     Bulk  Sampling 

Bulk  sampling  refers  to  the  sampling  of  material  which  is  available  in  bulk 
form.  Bulk  material  may  be  gaseous,  liquid  or  solid.  The  material  may  be 
homogeneous  (non-segregated),  like  acid  in  a  container,  or  it  may  be  segregated 
as  is  generally  the  case  with  bulk  material  occurring  in  nature,  like  solids  and 
liquids  shipped  in  large  tanks,  rail  cars,  and  ships  or  kept  in  stockpiles.  The 
material  may  occur  in  piles  with  no  uniquely  identifiable  subdivisions  that  can 
be  used  as  sampling  units.  It  may  also  come  packaged,  bagged  or  subdivided 
into unique sampling units practicable for a routine sampling operation.
Furthermore, the material may be in a static condition or a dynamic situation.

Static situations include bulk piles at a manufacturer's warehouse or dock, bulk
loads in transit in barges, rail cars and road wagons, bulk heaps or silos at
farms and stores, etc. According to pure sampling theory, it is impossible to
obtain a representative sample from a static heap because one of the basic rules
of sampling, namely that every particle must have an equal chance of selection,
cannot be obeyed. Unless the entire heap can be passed through the sampling
device or it can be coned or sectioned completely, this requirement cannot be
satisfied. Particles in the very center or on the bottom layer may have no chance
at all of being selected. In addition, the problems of segregation in static heaps
are well-known; segregation may affect the distribution of such characteristics as
chemical composition, physical properties, etc. Segregation may occur because of
size variation between the particles or density variation between the constituents
of a mix.

Dynamic situations include filling a bulk storage area at a factory, loading a
bulk transport carrier, emptying a bulk cargo at the point of delivery, etc. In
these situations, conveyor belts may be accessible to allow sampling either by
mechanical or alternate means or there may exist a free-fall position which will
enable samples to be drawn from the whole falling stream. Dynamic situations
are easier to handle and will normally accommodate available sampling plans.

The objectives of bulk sampling may include one or more of the following:

•  characterization  of  material  with  respect  to  amount,  content,  value,  grade, 
homogeneity,  etc. 

•  estimation  of  the  mean  value  of  the  characteristics  involved  as  well  as  their 
variability 

•  lot-by-lot  acceptance 

•  control  during  processing 

•  conformance  to  specifications  and  tolerances 

•  establishing  uniform  procedures  for  sampling  of  materials. 

3.5.1     Selecting  Samples  of  Segregated  Material 

Usually, bulk material is sampled by taking increments of the material, blending
these increments into a single composite sample and then, if necessary, reducing
this gross sample to a size suitable for laboratory testing. For bulk material
involving containers in batches with a known segregation pattern, a nested
sampling plan is often appropriate. Such a plan calls for randomly selecting
batches, then randomly selecting containers within these sampled batches, and
finally choosing random samples or increments from these sampled containers.
Where the material is known to be stratified, the plan may elect to take random
samples or increments from each stratum or from a number of randomly selected
strata.

For example, in double-stage sampling, where the bulk population consists of N
primary units, each of which is composed of M possible increments, the sampling
plan calls for randomly selecting n primary units followed by a random
sampling of m increments from each of these. Upon compositing, this gross
sample is then reduced to a size suitable for laboratory testing.
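The double-stage selection can be sketched as follows, using the symbols N, M, n and m from the paragraph above. The function name and the particular values (50 primary units of 20 increments each, 5 units and 3 increments per unit drawn) are hypothetical and chosen only to make the example concrete.

```python
import random


def two_stage_sample(N, M, n, m, seed=None):
    """Select n of N primary units, then m of the M increments in each one."""
    rng = random.Random(seed)
    primary_units = sorted(rng.sample(range(1, N + 1), n))
    return {unit: sorted(rng.sample(range(1, M + 1), m))
            for unit in primary_units}


# Hypothetical lot: 50 primary units (e.g. bags), each offering 20 increments.
plan = two_stage_sample(N=50, M=20, n=5, m=3, seed=7)
for unit, increments in plan.items():
    print("primary unit", unit, "-> increments", increments)

# The n * m increments are then blended into one gross (composite) sample.
print("increments in gross sample:", sum(len(v) for v in plan.values()))
```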

Although very little statistical thinking has gone into the preparation of standard
procedures for the sampling of bulk material, some situations are well
documented. One such case is the sampling of fertilizer. The Official Methods of
Analysis of the Association of Official Agricultural Chemists gives the following
directions for taking the original increments of fertilizer and forming the
composite sample:

Use  slotted  single  or  double  tube,  or  slotted  tube  and  rod,  with  solid  cone  tip  at 
one  end.  Take  sample  as  follows:  lay  bag  horizontally  and  remove  core  diagonally 
from  end  to  end.  From  lots  of  10  bags  or  more,  take  core  from  each  of  10  bags. 
When  necessary  to  sample  lots  of  fewer  than  10  bags,  make  sure  that  at  least  one 
core  is  taken  from  each  bag  present.  For  bulk  fertilizers,  draw  at  least  10  cores 
from  different  regions.  Bulk  shipments  may  be  sampled  at  time  of  loading  or 
unloading  by  passing  container  through  entire  stream  of  material  as  it  drops  from 
transfer  belt  or  chute.  For  small  packages  (10  pounds  or  less),  take  one  entire 
package  as  a  sample.  Reduce  composite  to  quantity  required,  preferably  by 
riffling,  or  by  mixing  thoroughly  on  clean  oilcloth  or  paper  and  quartering.  Place 
sample  in  airtight  container. 

To prepare the sample for laboratory analysis, the directions read as follows:

Reduce  gross  sample  to  quantity  sufficient  for  analysis  or  grind  not  less  than 
0.5  lb.  of  reduced  sample  without  previous  sieving.  For  fertilizer  materials  and 
moist fertilizer mixes, grind to pass sieve with 1 mm circular openings, or No. 20
std.  sieve;  for  dry  mixes  that  tend  to  segregate,  grind  to  pass  No.  40  std.  sieve. 
Grind  as  rapidly  as  possible  to  avoid  loss  or  gain  of  moisture  during  operation. 
Mix  thoroughly  and  store  in  tightly  stoppered  bottles. 


3.6     General  Layout  for  Sampling  Method 

The  sampling  process  has  to  be  designed  and  planned  judiciously.  A  well- 
planned  standard  layout  of  sampling  procedures  ensures  the  inclusion  of  all 
requisite  elements  and  uniformity  in  their  application.  A  suggested  general 
format  for  sampling  procedures  for  agricultural  and  food  products,  based  on  the 
international  standard  ISO/7002,  is  outlined  below.  Modifications  to  this  layout 
can  be  made,  commensurate  with  the  particular  needs  of  a  sampling  situation. 

A  generic  list  of  elements  required/recommended  for  the  layout  of  a  standard 
sampling  method  includes  the  following: 

• Objective(s)

• Principle of the method of sampling: essential steps of the method to be
  used, nature of the product to be sampled, purpose of sampling,
  appropriate sampling plan

• Administrative arrangements: sampling personnel, representation of parties
  concerned, health, safety, and security precautions, signing of the
  sampling report

• Identification and general inspection of the lot prior to sampling:
  identification of the lot before sampling, conditions and features of the lot
  and its surroundings, segregation of the lot into homogeneous units,
  method of marking units in the lot for random sample selection,
  methods of lot acceptance/rejection

• Sampling equipment and ambient conditions

• Sample containers and special packing: cleanliness, quality, suitability,
  robustness

• Sampling procedures: sample size, incremental sampling method,
  preparation of bulk sample, composite sample and reduced sample, selection
  of sample of prepackaged products

• Packing, sealing and marking of samples and sample containers

• Precautions during storage and transportation of samples

• Sampling report: administrative details, details of the units packed or the
  enclosures containing the lot, material sampled, sampling method,
  preparation and sealing of samples

• Annexes as necessary: model reports, cautionary notes, references to
  statutory regulations


CHAPTER 4
ACCEPTANCE  SAMPLING 


4.1  Introduction 

Acceptance  sampling  refers  to  the  process  of  accepting  or  rejecting  a  lot  by 
inspecting  a  sample  selected  in  accordance  with  a  predetermined  sampling  plan. 
A  sampling  plan  specifies  the  number  of  units  to  be  sampled,  the  acceptance/ 
rejection  criteria,  and  the  associated  probabilities  and  risks  of  acceptance.  The 
purpose  of  acceptance  sampling  is  not  to  estimate  lot  quality  but  to  sentence 
lots.  Acceptance  sampling  plans  do  not  provide  any  direct  form  of  quality 
control;  they  are  basically  audit  tools  to  ensure  that  the  output  of  a  process 
conforms  to  the  requirements. 

Sampling  plans  are  based  on  several  quality  characteristics  and  the  choice  of  a 
particular  type  of  plan  depends  largely  on  the  nature  of  the  product  and  the 
purpose of inspection. Selecting an adequate and suitable sampling plan, while
being very important, is not always an easy task because the selection is
dependent on a number of different factors such as the ease of administration, the
protection afforded, the amount of inspection required, the cost of inspection,
and the power of a plan to discriminate between a good and a bad lot.

4.2  Classification  of  Sampling  Plans 

There  are  a  number  of  different  ways  to  classify  acceptance  sampling  plans.  One 
major  classification  is  by  attributes  and  variables.  Variables,  of  course,  are 
quality  characteristics  that  are  measured  on  a  continuous  numerical  scale. 
Attributes  are  quality  characteristics  that  are  expressed  on  a  dichotomous  basis 
such as "go, no-go" and "defective, nondefective". The next classification of
sampling  plans  is  with  respect  to  quality  indices  and  risks.  Some  important 
quality  indices  are  defined  as  follows: 

•  Acceptable  Quality  Level  (AQL) 

It is a quality level which for the purpose of sampling inspection is the
limit of a satisfactory process average. The process average is the process
level averaged over a defined time period or quantity of production. The
AQL is associated with the α (alpha) risk, also known as the "producer's
risk", which is the probability of making a type I error, i.e., the risk of
rejecting a good lot.

•  Limiting  Quality  (LQ) 

It is a quality level which for the purpose of sampling inspection is the
limit of an unsatisfactory process average. It is also known as "Lot Tolerance
Percent Defective (LTPD)". The LQ is associated with the β (beta) risk,
also known as the "consumer's risk", which is the probability of making a
type II error, i.e., the risk of accepting a bad lot.

•  Average  Outgoing  Quality  (AOQ) 

It  is  the  expected  average  quality  level  of  outgoing  product  for  a  given 
value  of  incoming  product  quality  and  is  computed  over  all  accepted  lots 
plus  all  non-accepted  lots  after  the  latter  have  been  inspected  100%  and  the 
nonconforming  items  replaced  by  good  items.  The  "Average  Outgoing 
Quality  Limit  (AOQL)"  is  defined  as  the  maximum  AOQ  over  all 
possible values of incoming product quality, for a given acceptance sampling
plan and lot disposal specification.
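To make the AOQ and AOQL definitions concrete, the sketch below evaluates them for a single sampling plan. It rests on assumptions that are not stated in the text: the number of defectives in a sample is taken to be binomial, rejected lots are fully inspected with every nonconforming item replaced, and the common textbook approximation AOQ = Pa x p x (N - n)/N is used. The function names and the lot size of 2000 are hypothetical.

```python
from math import comb


def prob_accept(p, n, ac):
    """Probability of finding at most `ac` defectives in a sample of n (binomial)."""
    return sum(comb(n, d) * p**d * (1 - p)**(n - d) for d in range(ac + 1))


def aoq(p, n, ac, lot_size):
    """Average outgoing quality under rectifying inspection (approximation)."""
    return prob_accept(p, n, ac) * p * (lot_size - n) / lot_size


def aoql(n, ac, lot_size, steps=1000):
    """Largest AOQ over a grid of incoming fraction-defective values in [0, 1]."""
    return max(aoq(i / steps, n, ac, lot_size) for i in range(steps + 1))


# Hypothetical plan: n = 100, Ac = 4, applied to lots of 2000 units.
for pct in (1, 2, 3, 4, 6, 8):
    print(f"incoming {pct}% defective -> AOQ = {aoq(pct / 100, 100, 4, 2000):.4f}")
print(f"AOQL is about {aoql(100, 4, 2000):.4f}")
```

The AOQ rises with incoming percent defective, reaches a maximum (the AOQL) and then falls again, because very poor lots are almost always rejected and rectified.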


4.3     Characterization  of  a  Sampling  Plan 

Every acceptance sampling plan is characterized by the following elements:
sample size (n), acceptance number (Ac) and rejection number (Re), and
probability of acceptance (Pa). The sample size is the number of items selected for
inspection.  The  acceptance  number  is  the  largest  number  of  defective  items  (or 
defects)  in  the  sample  that  will  permit  the  lot  to  be  accepted.  The  rejection 
number  is  the  least  number  of  defective  items  (or  defects)  in  the  sample  that 
will  lead  to  the  rejection  of  the  lot.  The  probability  of  acceptance  of  a 
sampling  plan  is  the  percentage  of  samples  out  of  a  long  series  of  samples  that 
will  cause  the  product  to  be  accepted.  A  complete  plotting  of  the  probability  of 
acceptance  for  all  possible  levels  of  percent  defective  is  known  as  an  operating 
characteristic  (OC)  curve.  The  OC  curve  of  a  sampling  plan  quantifies  the 
risks  and  makes  it  possible  to  state  them  numerically  and  describe  the  quantity 
of  product  that  can  be  expected  to  be  accepted  if  the  quality  standard  is  met  and 
the  quantity  rejected  if  the  standard  is  not  met. 

In practice, a 95% acceptance probability is used with a given AQL and a 10%
acceptance probability is used with a given LQ. Consequently, a lot which is of
AQL quality is accepted with a probability of 1 - α = 0.95 and a lot of LQ
quality is accepted with a probability of β = 0.10. Figure 4.1 shows an OC
curve for these specific values of the probability of acceptance (Pa),
corresponding to an AQL of 2% and an LQ of 8%, for fixed sample size n = 100 and
acceptance number Ac = 4. As can be seen from the figure, under these
circumstances, a 2% defective product will be accepted 95% of the time and, if
the product yields an 8% defective rate, it will only be accepted 10% of the
time.

Figure 4.1     OC Curve for Single Sampling Plan
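Points on an OC curve such as the one in Figure 4.1 can be approximated numerically. The sketch below assumes that the number of defectives in the sample follows a binomial distribution, which is reasonable when the lot is large relative to the sample; this modelling choice and the function name oc_point are assumptions made here, not details given in the text.

```python
from math import comb


def oc_point(p, n, ac):
    """Probability of acceptance Pa: at most `ac` defectives in a sample of n."""
    return sum(comb(n, d) * p**d * (1 - p)**(n - d) for d in range(ac + 1))


n, ac = 100, 4   # the plan discussed in the text
for pct in (1, 2, 4, 6, 8, 10):
    print(f"{pct:>3}% defective -> Pa = {oc_point(pct / 100, n, ac):.3f}")
```

Under this model the plan accepts 2% defective product with a probability close to 0.95 and 8% defective product with a probability close to 0.10, in line with the AQL and LQ figures quoted above.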

4.4     Choosing  a  Sampling  Plan 

Many factors influence the choice of an appropriate sampling plan for a
particular situation. Some basic considerations are: (i) purpose of inspection,
(ii) nature of product, (iii) type of testing methods used, and (iv) nature of the
lots to be sampled. The purpose of the inspection may be to make an
accept/reject decision, to measure average quality or to determine product
variability/uniformity. The sample size may be influenced by factors associated
with the material, such as its homogeneity, unit size, consistency in meeting prior
specifications, and cost. The test procedure itself may influence the sampling
procedure;  for  example,  the  test  may  consider  critical  rather  than  minor  defects, 
it  may  be  destructive  rather  than  nondestructive,  or  it  may  require  considerable 
resource  investment.  The  size  of  the  lot  may  affect  the  type  of  distribution  used 
to  set  up  a  sampling  plan  while  its  composition  and  the  extent  to  which  the 
individual  units  in  the  lot  follow  a  random  distribution  may  influence  the  nature 
of  the  sampling  plan  used. 

Once the presampling considerations have been taken into account, the next
decision is with regard to the choice between attributes and variables sampling.
In most practical applications, acceptance sampling is done on an attributes
basis. Observations obtained from an attribute are usually simpler, less costly
and less time-consuming than those from a variable. However, variables data
provide more information about the quality characteristics of the product and
may, therefore, require smaller samples. Variables sampling also gives
information regarding the degree of nonconformance and can identify specific areas in
which quality improvement is required. In choosing between variables and
attributes sampling in situations where either could be used, various trade-offs
should be considered with respect to cost of inspection, sample size, ease of
administration, etc.

The  sampling  plans  are  generally  indexed  by  one  or  more  of  the  quality  indices, 
viz.,  AQL,  LQ,  and  AOQ. 

The  AQL  plans  are  basically  producer-oriented  plans.  They  are  used  by  the 
regulatory  authority  or  the  buyer  to  ensure  that  the  producer  is  meeting  the 
quality  and  safety  requirements  agreed  upon.  A  product  of  AQL  quality  or  better 
guarantees  a  high  probability  of  lot  acceptance  during  the  process  of  sampling 
inspection. 

The LQ plans, on the other hand, are consumer-oriented plans. They are used to
ensure that the probability of acceptance of a lot of LQ quality or worse is very
low. These plans are generally meant to be used for isolated lot inspection.

The AOQ plans apply only to programs that submit rejected lots for 100%
inspection. These plans are designed so as to minimize the average total
inspection for a given AOQL and process average. Unless rectifying inspection is
used, the AOQL concept is meaningless.

One can devise a sampling plan based jointly on both the AQL and the LQ, if
needed. Alternatively, a plan can be based on only the AQL but the expected
LQ can be easily identified, or vice-versa. A requisite plan can be derived either
by using basic mathematical formulas or it can be selected from one of the
ready-made statistical sampling tables, documents or standards. For the sake of
simplicity, we shall limit our discussion to the selection of acceptance sampling
plans using one of the international standards. Since the most common
inspection situation for food commodities pertains to lot-by-lot inspection of product
presented by the producer to the regulatory body, we shall describe, in Section
4.6, the method of selecting an attributes sampling plan indexed by AQL using
the international standard ISO/2859-1.


4.5     Ready-Made  Sampling  Plan  Systems 

For the benefit of more ambitious readers, we present, in this section, a list of
important documents relating to acceptance sampling. Most of these documents
are in the form of international standards developed by ISO (International
Organization for Standardization). For further reference, the reader should
consult the bibliography.

• Plans Indexed by AQL:

  - ISO/2859-1: Sampling procedures for inspection by attributes,
    Part 1: Sampling plans indexed by AQL for lot-by-lot inspection. This
    document is identical to MIL-STD-105D of the United States Defence
    Department.

  - ISO/3951: Sampling procedures and charts for inspection by
    variables for percent nonconforming. This document is identical to
    MIL-STD-414 of the United States Defence Department.

• Plans Indexed by LQ:

  - ISO/2859-2: Sampling procedures for inspection by attributes,
    Part 2: Sampling plans indexed by LQ for isolated lot inspection.

  - Dodge and Romig sampling inspection tables.

• Plans Indexed by AOQ:

  - Dodge and Romig sampling inspection tables.

• Other Types of Sampling Plan Systems:

  - ISO/8550: Guide for selection of an acceptance sampling system,
    scheme or plan.

  - ISO/2859-0: Sampling procedures for inspection by attributes,
    Part 0: Introduction to the ISO/2859 attribute sampling system.

  - ISO/2859-3: Sampling procedures for inspection by attributes,
    Part 3: Skip lot sampling plan.

  - ISO/8422: Sequential sampling plans for inspection by attributes.

  - ISO/8423: Sequential sampling plans for inspection by variables for
    percent nonconforming (known standard deviation).

  - Continuous sampling plans.

  - Chain lot sampling plans.

  - ISO/3534: Statistics vocabulary and symbols
      Part 1: Probability and general statistical terms
      Part 2: Statistical quality control
      Part 3: Design of experiments

4.6     AQL  Attributes  Sampling  Plans:  ISO/2859-1 

The acceptance sampling standard ISO/2859-1 has been developed by the
International Organization for Standardization, Technical Committee TC69, and is
based on the United States Department of Defence document MIL-STD-105D:
Sampling Procedures and Tables for Inspection by Attributes.

ISO/2859-1 provides sampling inspection plans by attributes, based on AQL,
that are designed to be applied to lots emerging from long production runs of
many units of product. Information is also provided to allow for easy extraction
of the protection afforded by these plans in terms of LQ and AOQL.

The basic aim of the standard is the maintenance of the outgoing quality level at
a given "acceptable quality level" or better. It is designed such that, if the
production runs consistently at precisely the AQL, then a large majority of its
lots can be expected to pass inspection. Thus, the AQL is the minimum quality
performance at which the producer may safely run his operation; he is,
therefore, advised to operate at a quality level at least as good as the AQL.

Three  types  of  sampling  plans  are  provided  in  this  standard:  single,  double  and 
multiple.  The  choice  of  which  type  to  use  depends  on  many  factors  such  as 
quality  history,  quality  requirements,  inspection  level,  lot  size,  type  of  sampling, 
AQL,  and  other  economic  considerations. 

Single  Sampling: 

In  a  single  sampling  plan,  a  single  sample  of  n  items  is  selected  at  random  from 
the  lot.  The  decision  concerning  the  acceptability  of  the  lot  is  made  on  the  basis 
of  the  results  obtained  from  this  sample.  If  the  number  of  defective  units  found 
in  the  sample  is  less  than  or  equal  to  the  acceptance  number  Ac,  the  lot  is 
accepted.  If  the  number  of  defectives  is  equal  to  or  greater  than  the  rejection 
number  Re,  the  lot  is  rejected. 

Double  Sampling: 

In a double sampling plan, a first sample of n1 units is selected at random from
the lot and inspected. If the number of defectives is less than or equal to the
first acceptance number Ac1, the lot is accepted. If the number of defectives is
equal to or greater than the first rejection number Re1, the lot is rejected. If no
decision can be made from the first sample because the number of defectives is
greater than Ac1 but less than Re1, a second sample of n2 units is selected at
random from the lot and inspected. If the cumulative number of defectives from
the first and second sample is less than or equal to the second acceptance
number Ac2, the lot is accepted. And, if the cumulative number of defectives is
equal to or greater than the second rejection number Re2, the lot is rejected.

The  average  number  of  items  inspected  under  double  sampling  is  generally  less 
than  that  inspected  under  single  sampling.  Despite  this  smaller  sampling  rate, 
double  sampling  is  less  frequently  used  than  single  sampling  since  it  demands 
more  record  keeping.  Some  inspectors  improperly  interpret  double  sampling  as 
giving  the  product  a  second  chance.  Generally,  double  sampling  is  used  in 
situations  where  the  lot  quality  is  known  to  be  either  very  good  or  very  bad. 
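The double sampling decision rule, and the average number of items inspected under it, can be sketched as follows. The plan values (n1 = n2 = 50, Ac1 = 1, Re1 = 4, Ac2 = 4, Re2 = 5) are hypothetical and are not taken from ISO/2859-1; the binomial model for the number of defectives in the first sample is likewise an assumption.

```python
from math import comb


def double_sampling_decision(d1, plan, d2=None):
    """Apply the double sampling rule; d2 is None until a second sample is drawn."""
    if d1 <= plan["ac1"]:
        return "accept"
    if d1 >= plan["re1"]:
        return "reject"
    if d2 is None:
        return "draw second sample"
    total = d1 + d2
    if total <= plan["ac2"]:
        return "accept"
    if total >= plan["re2"]:
        return "reject"
    raise ValueError("plan is not closed: Re2 should equal Ac2 + 1")


def average_sample_number(p, plan):
    """Expected number of items inspected per lot at fraction defective p."""
    n1 = plan["n1"]
    undecided = sum(comb(n1, d) * p**d * (1 - p)**(n1 - d)
                    for d in range(plan["ac1"] + 1, plan["re1"]))
    return n1 + plan["n2"] * undecided


# Hypothetical plan, for illustration only.
plan = {"n1": 50, "n2": 50, "ac1": 1, "re1": 4, "ac2": 4, "re2": 5}
print(double_sampling_decision(0, plan))          # decided on the first sample
print(double_sampling_decision(2, plan))          # no decision yet
print(double_sampling_decision(2, plan, d2=1))    # decided on cumulative count
for pct in (1, 5, 20):
    print(f"{pct}% defective: about {average_sample_number(pct / 100, plan):.1f} items")
```

The second sample is needed only when the first result falls between Ac1 and Re1, so the average number inspected stays close to n1 when lot quality is either very good or very bad, which is the situation noted above as best suited to double sampling.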

Multiple  Sampling: 

The procedure in multiple sampling is similar to that for double sampling with
the exception that the number of successive samples required to reach a
decision to accept or reject the lot may be more than two. The number of steps
required to reach a firm decision depends on the cumulative number of
defectives found in the samples taken progressively. There is an acceptance/rejection
criterion at each step, namely, accept the lot at any step where the cumulative
number of defectives is equal to or less than the acceptance number and reject it
whenever the cumulative number of defectives equals or exceeds the rejection
number. If the cumulative number of defectives is between the accept/reject
figures, another sample is drawn. All multiple sampling is terminated after a
specified number of steps by arranging the acceptance and rejection figures to
be consecutive at the last step, thus forcing a decision to accept or reject the lot.
However, the size of the cumulative sample at the last step is larger than the
equivalent in single and double sampling plans.
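The step-by-step rule for multiple sampling can be sketched as a loop over cumulative defect counts. The five-step plan used here is hypothetical and is not taken from the standard; an acceptance number of -1 is used, as an illustrative convention only, to mark a step at which acceptance is not yet possible.

```python
def multiple_sampling_decision(defectives_per_sample, plan):
    """Return (decision, step) after applying the plan to the samples drawn so far.

    plan is a list of (acceptance number, rejection number) pairs, one per step.
    The last pair should be consecutive so that a decision is forced.
    """
    cumulative = 0
    step = 0
    for step, (found, (ac, re)) in enumerate(
            zip(defectives_per_sample, plan), start=1):
        cumulative += found
        if cumulative <= ac:
            return "accept", step
        if cumulative >= re:
            return "reject", step
    return "draw another sample", step


# Hypothetical plan; -1 means acceptance is not permitted at that step.
plan = [(-1, 2), (0, 3), (1, 4), (2, 5), (4, 5)]

print(multiple_sampling_decision([0, 0], plan))
print(multiple_sampling_decision([1, 2], plan))
print(multiple_sampling_decision([1], plan))
```

Because the acceptance and rejection numbers at the last step are consecutive (4 and 5 in this hypothetical plan), the loop always ends with a decision once all five samples have been drawn.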

Inspection  Levels: 

Seven inspection levels, S-1, S-2, S-3, S-4, I, II, and III, are provided in the
standard for varying degrees of discrimination and each level provides different
sample sizes for a given lot size. In the order given above, sample size (and,
therefore, discrimination) increases from a minimum at special level S-1 to a
maximum at general level III. Levels S-1 to S-4 are considered special levels,
which are limited in application to situations where it is imperative that only
small sample sizes be used, such as in the case of destructive testing of
expensive units of product. General inspection level II is considered the normal level
and is to be used at the commencement of any inspection activity unless
otherwise specified.

The  standard  also  provides  three  levels  of  inspection  in  terms  of  the  severity  of 
inspection:  normal,  tightened,  and  reduced.  Normal  inspection  is  used  at  the 
start.  Then,  if  the  quality  is  shown  to  be  poor,  the  inspector  is  directed  to  be 
more  severe  in  his  inspection  and  use  the  tightened  level.  If  the  quality  is  shown 
to be consistently high, reduced inspection is indicated. Guidelines for establishing
switching rules between the normal, tightened, and reduced levels of
inspection  are  further  provided. 

Sample  Size: 

A letter code system is used for determining sample size. This is given in
Table 1 of ISO/2859-1 and as Table 5A of the Appendix. The letter assigned to
a given sample is dependent on the inspection level and the lot size. In
Table 5A, varying blocks of lot sizes are listed vertically against the inspection
levels listed horizontally so that any specific lot may fall within one of these
blocks to provide the corresponding letter designating sample size.

AQL: 

The AQL serves as a border-line value, chosen to demarcate what will and will
not be considered acceptable as a process average, and as such is an indicator of
the quality required in production. A realistic AQL must be chosen as a compromise
between the capability of the producer, the expectation of the consumer, and the
available process average. In ISO/2859-1, several AQL values are given, progressing
from a minimum of 0.010 to a maximum of 1000. AQLs from 0.010 to
10 inclusively may be expressed either in percent nonconforming (percent defective)
or in nonconformities (defects) per 100 units. AQLs greater than 10 are
expressed in defects per 100 units only. The sampling tables provide what are
known as "preferred AQLs". For ease of administration, it is advisable to use
preferred AQL values as much as possible. However, if a specified AQL is not a
preferred AQL, the tables are not applicable and a sampling plan must be
specially derived for that particular AQL.

4.6.1     Layout  of  Sampling  Plans  in  ISO/2859-1 

The  generic  layout  for  the  sampling  plans  contained  in  ISO/2859-1  is  as 
follows: 

• Table 1 contains: ranges of lot sizes; inspection levels (special levels: S-1,
S-2, S-3, S-4 and general levels: I, II, III); sample size code letters.

• Tables 2-A,B,C; 3-A,B,C; 4-A,B,C provide: preferred AQLs; sample
sizes for selected code letters; acceptance/rejection numbers; the
type/nature of inspection as follows:

              Nature of inspection
Type          Normal      Tightened      Reduced

Single        2-A         2-B            2-C
Double        3-A         3-B            3-C
Multiple      4-A         4-B            4-C

•  Tables  5-A,B  provide:  AOQL  values  for  normal  and  tightened  inspection. 
These  tables  assist  in  identifying  what  limit  of  average  outgoing  quality  to 
expect  when  using  a  specified  AQL  plan. 

•  Tables  6-A,B  and  7-A,B  provide:  LQ  values  for  normal,  single  inspection 
plans.  These  tables  assist  in  identifying,  for  a  specified  AQL  plan,  the  limit 
of  an  unsatisfactory  process  average  for  which  there  is  a  low  probability  of 
acceptance.  LQ  values  are  always  greater  than  the  AQL,  and  in  some  cases 
considerably  greater,  but  the  difference  between  the  LQ  and  the  AQL 
values  decreases  as  the  sample  size  increases. 

•  Tables  8  and  9  provide:  limit  numbers  for  reduced  inspection  and  average 
sample  size  curves  for  double  and  multiple  sampling  respectively. 

• Tables 10-A to 10-S give: for each sample size code letter, the OC curves
and  the  exact  probabilities  of  acceptance. 

Reproductions of a few sample pages from the standard are given in the
Appendix to help explain the working of the document. Thus, Tables 1, 2-A
and 3-A of the standard are referenced as Appendix Tables 5A, 5B and 5C,
respectively.

4.6.2     Procedure for Selecting a Sampling Plan from ISO/2859-1

A  step-by-step  procedure  for  using  ISO/2859-1  is  as  follows: 

1.  Decide  on  the  size  of  the  lot  (N)  which  is  to  be  sampled  and  inspected.  This 
need  not  be  a  production  lot  size. 

2.  Decide  upon  an  inspection  level.  In  general,  level  II  is  recommended  at  the 
commencement  of  inspection  activities. 

3.  Using  the  information  from  steps  1  and  2,  enter  Table  1  of  ISO/2859-1 
(Appendix  Table  5A)  to  find  the  corresponding  sample  size  code  letter, 
A,  B,  ...,  or  R,  the  latter  calling  for  the  largest  sample  sizes. 

4.  Decide  upon  single,  double,  or  multiple  sampling. 

5.  Decide  whether  to  start  with  normal  (almost  always),  tightened,  or  reduced 
sampling. 

6.  Decide  upon  the  basis  of  inspection,  viz.,  defectives  or  defects. 

7.  Decide  upon  the  desired  AQL  from  the  preferred  AQLs  available  in  the 
document,  if  at  all  possible.  AQL  values  of  10.0%  or  less  may  be  expressed 
either  in  percent  defective  or  in  defects  per  100  units;  those  over  10.0%  are 
expressed  in  defects  per  100  units  only. 

8. For the nature and the type of sampling, the AQL and the sample size code
letter determined in steps 5, 4, 7, and 3 respectively, enter the relevant table
in the document. The acceptance/rejection numbers are commonly given in
the body of the table against the sample size (n) listed to the left.

9. Following the above, we may reach a dot or an arrow. A dot denotes the use
of the single sampling plan, corresponding to the desired AQL and code
letter, instead of double or multiple sampling plans. If an arrow is encountered,
follow it to the first entry containing acceptance/rejection numbers and
use the sample size to the left of this entry, not the one associated with the
original code letter.

Example  4.1     Selecting  a  Sampling  Plan  Using  ISO/2859-1 

Consider  a  situation  of  lot-by-lot  inspection  of  skim  milk  powder  where  we  have 
lot  sizes  of  250  bags  and  will  use  inspection  level  II,  for  normal  sampling,  with 
an  AQL  of  0.40%  defective  for  the  characteristic  in  question. 

For a lot size of 250, the appropriate sample size code letter is G, from
Appendix Table 5A.

If  a  normal  single  sampling  plan  is  desired,  we  use  Appendix  Table  5B.  In  this 
table,  in  the  column  0.40  and  the  row  G,  we  read  Ac  =  0,  Re  =  1,  and  the 
sample  size  n  =  32.  Thus,  the  plan  calls  for  taking  a  random  sample  of  32  bags 
from  the  250  in  the  lot,  inspecting  each  one  for  the  characteristic  of  interest  and, 
finding d defective bags: (i) accepting the lot if d = 0, and (ii) rejecting the lot
if d ≥ 1.
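
The single sampling rule just described can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and the lot containing exactly one defective bag is a hypothetical case, not data from the example. The probability of acceptance is computed from the hypergeometric distribution, since the sample is drawn without replacement from a finite lot.

```python
from math import comb

def single_sampling_decision(d, ac, re):
    """Sentence a lot from d defectives found in a single sample."""
    if d <= ac:
        return "accept"
    if d >= re:
        return "reject"
    raise ValueError("Ac and Re must be consecutive in a single sampling plan")

def prob_accept(lot_size, defectives_in_lot, n, ac):
    """P(d <= Ac) when n units are drawn without replacement (hypergeometric)."""
    good = lot_size - defectives_in_lot
    favourable = sum(
        comb(defectives_in_lot, d) * comb(good, n - d) for d in range(ac + 1)
    )
    return favourable / comb(lot_size, n)

# Plan read from the tables in Example 4.1: N = 250, n = 32, Ac = 0, Re = 1.
print(single_sampling_decision(0, ac=0, re=1))   # accept
print(single_sampling_decision(1, ac=0, re=1))   # reject

# Hypothetical lot with exactly 1 defective bag in 250 (0.4% defective).
print(round(prob_accept(250, 1, 32, 0), 3))      # 0.872
```

With Ac = 0, a lot running exactly at the AQL is still rejected in roughly one case out of eight under this assumption, which illustrates why the choice of AQL and sample size deserves care.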

On the other hand, if a normal double sampling plan is desired for the same
conditions, then we use Appendix Table 5C and find that the two sample sizes
are n1 = n2 = 20 and that a dot (•) appears as the entry under the
column 0.40. The dot (•) indicates two alternatives: either (i) to use the corresponding
single sampling plan or (ii) to use the double sampling plan below,
where available, in Appendix Table 5C. Following the latter course of action,
we obtain Ac1 = 0, Re1 = 2, Ac2 = 1, and Re2 = 2. To the left of these entries,
we now find n1 = n2 = 80. Note, very particularly, that we do not use the
sample sizes n1 = n2 = 20, which normally apply to code letter G. Thus, our
plan is to take a random sample of 80 bags from the 250 in the lot, inspect each
one, and, finding d1 defective bags: (i) accept the lot if d1 = 0, (ii) reject the lot
if d1 ≥ 2, or (iii) take another sample from the lot if d1 = 1. If the latter option
is required, suppose that this second sample of 80 from the 170 remaining bags
in the lot yields d2 defective bags. Then, (i) accept the lot if d1 + d2 = 1 and
(ii) reject it if d1 + d2 ≥ 2.
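
The two-stage rule of this example can be written out as a small Python function. The sketch below is illustrative only: the function name and its default arguments are assumptions made here, with the defaults set to the acceptance and rejection numbers quoted in the example.

```python
def double_sampling_decision(d1, d2=None, ac1=0, re1=2, ac2=1, re2=2):
    """Sentence a lot under a double sampling plan.

    d1 is the number of defectives in the first sample; d2 is the number in
    the second sample, or None if the second sample has not been taken.
    """
    if d1 <= ac1:
        return "accept"
    if d1 >= re1:
        return "reject"
    if d2 is None:
        return "take second sample"
    cumulative = d1 + d2
    if cumulative <= ac2:
        return "accept"
    if cumulative >= re2:
        return "reject"
    # Only reachable when Ac2 and Re2 are not consecutive (see the Note below
    # on reduced sampling).
    return "no decision"

print(double_sampling_decision(0))        # accept on the first sample
print(double_sampling_decision(1))        # take second sample
print(double_sampling_decision(1, 0))     # accept: cumulative count is 1
print(double_sampling_decision(1, 1))     # reject: cumulative count is 2
```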

Note:  If a reduced double sampling plan had been used instead, one
would have come across a gap between the acceptance and rejection
numbers, specifically, Ac2 = 0 and Re2 = 2. Now, if a two-sample inspection
revealed d1 + d2 = 1, it appears that no decision could have
been reached. This circumstance is covered in Section 11.1.4 of the
standard where it says that, if such an event occurs, one is to accept this
lot (because the previous quality had been excellent relative to the AQL)
but be alerted to the possibility that the quality has now slipped from its
previous level of excellence. Therefore, one is to abandon reduced sampling
with its rather lenient OC curve and return to normal sampling
under the same conditions.


4.7     Other  Acceptance  Sampling  Procedures 

Although  we  are  not  going  to  give  any  detailed  description  of  other  acceptance 
sampling  procedures,  a  brief  idea  of  the  principles  behind  some  of  the  more 
important  of  these  procedures  is  presented  here.  The  reader  should  consult  the 
bibliography  for  a  more  elaborate  explanation  and  usage  of  these  methods. 


4.7.1     Continuous  Sampling  Plans 

Continuous  sampling  plans  are  used  for  a  production  process  where  no  separate 
"lots"  exist.  They  are  generally  used  on  conveyors  but  are  applicable  to  any 
continuously  running  operation  where  we  do  not  wish  to  accumulate  the 
product  into  lots  for  purposes  of  inspection. 

To  give  a  general  picture  of  continuous  sampling,  suppose  we  have  a  continuous 
flow  of  product  which  is  3.0%  defective.  We  begin  to  inspect  this  product, 
classifying each unit in order as defective or nondefective. If 0 represents a
nondefective  unit  and  X  a  defective  one,  the  record  of  inspection  results  may 
resemble  the  following  pattern: 

00X  00000X  000X  00000000000X 

00000000X  000000000000000X  0000 

The  number  of  nondefective  units  between  two  consecutive  defectives,  referred 
to  as  "defective  spacing  (s)",  generally  follows  a  probabilistically  predictable 
pattern.  If  a  product  is  worse  than  3.0%  defective,  the  defective  units  will  occur 
more  frequently  and  the  spacing  s  will  tend  to  become  shorter.  If  the  product  is 
less  than  3.0%  defective,  the  defective  units  will  occur  less  often  and  the 
spacing s will tend to become longer. Since products of different quality produce
different patterns of s, it is possible to set up an "acceptance criterion" in
terms  of  s  which  will  reject  more  product  of  a  "bad"  level  of  quality  and  accept 
more  of  a  "good"  quality  level,  where  good  and  bad  may  be  defined  as  desired 
and  varied  for  different  applications. 

The basic continuous sampling plan, called CSP-1, operates by specifying a
"clearing interval (i)" which is a fixed parameter for a given continuous sampling
plan, denoting the number of consecutive units to be inspected and found
clear of defects before the process qualifies for regular random sampling.

Initially,  consecutive  units  are  100%  inspected  until  the  clearing  interval  (i) 
qualification  is  met.  Thereupon,  only  a  fraction  (f)  of  the  units  are  inspected, 
selecting  individual  units  one  at  a  time  from  the  flow  of  product,  in  such  a 
manner  as  to  assure  an  unbiased  sample  of  fraction  f.  A  simple  operation 
schematic  of  CSP-1  is  shown  in  Figure  4.2. 


START
   |
   v
Inspect 100% of units consecutively until i units in succession are found
clear of defects.
   |
   |  When i units in succession are found clear of defects
   v
Discontinue 100% inspection and inspect only a fraction f of the units,
selecting individual units one at a time from the flow of product, in such a
manner as to assure an unbiased sample.
   |
   |  When a defect is found
   v
Return to 100% inspection.


Figure 4.2     Operation Schematic for Continuous Sampling Plan: CSP-1

When a unit is found to be defective during the regular sampling period,
immediate reversion to 100% inspection is again required until the qualification
for regular sampling is met by again satisfying the clearing interval, i. Continuous
sampling is generally of the AOQL type, involving periods of 100%
inspection  and  periods  of  regular  random  sampling.  The  AOQL  achieved  is 
determined  by  the  selected  values  of  i  and  f.  For  further  details  regarding 
continuous  sampling  procedures,  the  reader  should  consult  Stephens  (1979). 
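
The alternation between 100% inspection and fractional sampling in CSP-1 can be simulated with a short Python sketch. Everything named here is an assumption made for illustration: the function name, the representation of the product flow as a list of booleans, and in particular the choice to select each unit independently with probability f during the sampling phase, which is only one of several ways of obtaining an unbiased sample of fraction f.

```python
import random

def run_csp1(units, i, f, seed=0):
    """Apply CSP-1 to a flow of units (True = defective).

    Returns (number of units inspected, number of defectives found).
    """
    rng = random.Random(seed)      # seeded so that a run can be repeated
    screening = True               # start with 100% inspection
    clear_run = 0                  # consecutive clear units seen while screening
    inspected = found = 0
    for defective in units:
        if screening:
            inspected += 1
            if defective:
                found += 1
                clear_run = 0      # the clearing interval starts over
            else:
                clear_run += 1
                if clear_run >= i:
                    screening = False   # qualified for fractional sampling
        elif rng.random() < f:     # assumed selection rule: probability f per unit
            inspected += 1
            if defective:
                found += 1
                screening = True   # revert to 100% inspection
                clear_run = 0
    return inspected, found

# Hypothetical flow: 3.0% defective, clearing interval i = 20, fraction f = 0.1.
flow_rng = random.Random(1)
flow = [flow_rng.random() < 0.03 for _ in range(5000)]
print(run_csp1(flow, i=20, f=0.1))
```

Running the sketch for different values of i and f on the same flow shows how a longer clearing interval or a larger fraction increases the inspection effort, which is the trade-off behind the AOQL determined by i and f.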

4.7.2  Skip-Lot  Sampling  Plans:  ISO/2859-3 

Skip-lot  sampling  is  an  acceptance  sampling  procedure  in  which  some  lots  in  a 
series  are  accepted  without  inspection  (other  than  possible  spot  checks)  when 
the  sampling  results  for  a  specified  number  of  immediately  preceding  lots  meet 
the  stated  criteria.  It  is  a  procedure  for  reducing  the  inspection  effort  on 
products  submitted  by  those  suppliers  who  have  demonstrated  their  ability  to 
control,  in  an  effective  manner,  all  facets  of  product  quality  and  consistently 
produce  superior  quality  material. 

These  plans  are  intended  only  for  a  continuous  series  of  lots  or  batches  and 
should not be used for isolated lots. The lots to be inspected are chosen randomly
in accordance with a stated frequency, called the "skip-lot frequency". A
skip-lot  frequency  of  1  lot  in  2,  for  example,  means  that  the  long-run  average 
proportion  of  inspected  lots  is  fifty  percent.  For  a  method  of  determining  the 
skip-lot frequency and the sample size, the reader should refer to the international
standard ISO/2859-3.

4.7.3  Sequential  Sampling  Plans  by  Attributes:  ISO/8422 

Under  a  sequential  sampling  plan  by  attributes,  units  are  selected  at  random  and 
subjected  to  inspection,  one  by  one,  and  a  cumulative  count  is  kept  of  the 
number of nonconforming units (or of the number of nonconformities). Following
the individual inspection of each unit, this cumulative count is used to
assess  whether  or  not  there  is  sufficient  information  to  sentence  the  lot  at  that 
stage  of  the  inspection. 

If,  at  a  certain  stage,  the  cumulative  count  is  such  that  the  risk  of  accepting  a  lot 
of  unsatisfactory  quality  (the  consumer's  risk)  is  sufficiently  low,  the  lot  is 
considered  acceptable  and  the  sampling  of  that  lot  is  terminated. 

If,  on  the  other  hand,  the  cumulative  count  is  such  that  the  risk  of  non- 
acceptance  for  a  lot  of  satisfactory  quality  (the  producer's  risk)  is  sufficiently 
low,  the  lot  shall  be  considered  non-acceptable  and  the  sampling  of  that  lot  is 
terminated. 

If  the  cumulative  count  does  not  allow  either  of  the  above  decisions  to  be  taken, 
then  an  additional  unit  of  product  is  inspected.  The  process  is  continued  until 
sufficient  sample  information  has  been  accumulated  to  warrant  a  final  decision 
for  the  lot.  For  further  details,  consult  ISO/8422:  Sequential  Sampling  Plans  for 
Inspection  by  Attributes. 
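
To make the idea concrete, the sketch below uses Wald's classical sequential probability ratio test, in which the cumulative count is compared after each unit with two parallel straight lines. This is offered as an illustration of the principle and is not a reproduction of the plans tabulated in ISO/8422, which should be consulted for actual use. The function names and the values chosen for the two quality levels (p0, p1) and the two risks (alpha, beta) are hypothetical.

```python
from math import log

def wald_lines(p0, p1, alpha, beta):
    """Intercepts and slope of the accept/reject lines of Wald's test.

    p0: satisfactory fraction nonconforming, rejected with risk alpha
    p1: unsatisfactory fraction nonconforming, accepted with risk beta
    """
    g1 = log(p1 / p0)
    g2 = log((1 - p0) / (1 - p1))
    h_accept = log((1 - alpha) / beta) / (g1 + g2)
    h_reject = log((1 - beta) / alpha) / (g1 + g2)
    slope = g2 / (g1 + g2)
    return h_accept, h_reject, slope

def sequential_decision(results, p0, p1, alpha, beta):
    """Inspect units one by one (True = nonconforming) until a decision.

    Returns (decision, number of units inspected).
    """
    h_accept, h_reject, slope = wald_lines(p0, p1, alpha, beta)
    count = 0
    for n, nonconforming in enumerate(results, start=1):
        count += int(nonconforming)
        if count <= slope * n - h_accept:
            return "accept", n
        if count >= slope * n + h_reject:
            return "reject", n
    return "continue", len(results)

# Hypothetical plan: p0 = 1%, p1 = 10%, alpha = 5%, beta = 10%.
print(sequential_decision([False] * 50, 0.01, 0.10, 0.05, 0.10))
print(sequential_decision([True, True], 0.01, 0.10, 0.05, 0.10))
```

In this untruncated form the test can, in principle, run for a long time on a lot of intermediate quality; practical plans add a maximum sample size at which a decision is forced.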

CHAPTER 5
STATISTICAL  PROCESS  CONTROL 


5.1     Introduction 

The  traditional  approach  to  manufacturing  is  to  depend  on  production  to  make 
the  product  and  on  quality  control  to  inspect  the  final  product  and  screen  out 
items  not  meeting  specifications.  This  strategy  of  detection  is  often  wasteful  and 
uneconomical because it involves after-the-event inspection when the unacceptable
production has already occurred. Instead, it is much more effective to
institute a strategy of prevention to avoid waste by not producing unusable
output in the first place. This can be accomplished by gathering process information
and analyzing it so that action can be taken on the process itself. This is
accomplished  through  Statistical  Process  Control  (SPC)  methods. 

The object of statistical process control is to serve in establishing and maintaining
a process at an acceptable and stable level so as to ensure the conformity of
products  and  services  to  specified  requirements.  The  major  statistical  tool  used 
to  achieve  this  is  the  control  chart,  which  is  a  graphical  method  of  presenting 
and  comparing  information,  based  on  a  sequence  of  samples  representing  the 
current  state  of  a  process  against  limits  established  after  consideration  of  the 
inherent  process  variability.  The  control  chart  method  helps  first  to  evaluate 
whether  or  not  a  process  has  attained,  or  continues  in,  a  state  of  statistical 
control  at  the  proper  specified  level  and  then  to  obtain  and  maintain  control  and 
a  high  degree  of  uniformity  in  important  product  or  service  characteristics  by 
keeping  a  continuous  record  of  the  quality  of  the  product  while  production  is  in 
progress. 

The  control  chart  as  a  graphical  means  of  applying  the  principles  of  statistical 
significance  to  the  control  of  the  production  process  was  first  proposed  by 
Dr.  Walter  Shewhart  in  1924.  Control  chart  theory  recognizes  two  kinds  of 
variability.  The  first  kind  is  random  variability  due  to  "chance  causes"  or 
"common  causes".  The  elimination  or  correction  of  common  causes  requires  a 
management  decision  to  allocate  resources  to  improve  the  process  and  system. 
The  second  kind  of  variability  represents  a  real  change  in  the  process.  Such  a 
change  can  be  attributed  to  some  identifiable  causes  that  are  not  an  inherent  part 
of the process and which can, at least theoretically, be eliminated. These identifiable
causes are referred to as "assignable causes" or "special causes" of
variation. They may be attributable to the lack of uniformity in material, workmanship
or procedures, a broken tool, or to the irregular performance of manufacturing
or testing equipment.

Control  charts  aid  in  the  detection  of  unnatural  patterns  of  variation  in  data 
resulting  from  repetitive  processes  and  provide  criteria  for  detecting  a  lack  of 
statistical  control.  A  process  is  in  statistical  control  when  the  variability  results 
only  from  random  or  common  causes.  Once  this  acceptable  level  of  variation  is 
determined, any deviation from that level is assumed to be the result of assignable
causes which should be identified and eliminated or reduced.


5.2     Types  of  Control  Charts 

The  most  important  types  of  control  charts  commonly  used  for  process  control 
studies  are  the  following: 

•  Shewhart  System  of  Charts 

— Charts for Variables: Mean ($\bar{X}$) Chart, Range (R) or Standard
   Deviation (s) Chart

— Charts for Attributes: Fraction Defective (p) Chart,
   Number Defective (np) Chart,
   Number of Defects (c) Chart,
   Number of Defects per Unit (u) Chart

— Charts for Individuals (X), and Moving Ranges (R)

— Median (Me) Chart and Range (R) Chart

•  Cumulative  Sum  (or  Cusum)  Charts 

•  Acceptance  Control  Charts 

We shall limit our discussion to the Shewhart $\bar{X}$ and R charts for variables, the
p  chart  for  attributes  and  the  Cusum  charts. 


5.3     Shewhart  System  of  Charts 

A  Shewhart  control  chart  requires  data  obtained  by  sampling  the  process  at 
regular  intervals.  The  intervals  may  be  defined  in  terms  of  time  (e.g.,  hourly)  or 
quantity  (e.g.,  every  lot)  and  determine  the  subgroups  or  samples  selected  from 
the process. From each subgroup, one or more subgroup characteristics are
computed such as the subgroup average, $\bar{X}$, and the subgroup range, R, or
standard deviation, s. The chart consists of the values of a given subgroup
characteristic plotted against the subgroup numbers, a central line (CL) located
at  a  reference  value,  and  two  statistically  determined  control  limits,  one  on 
either  side  of  the  central  line,  which  are  called  the  upper  control  limit  (UCL) 
and  the  lower  control  limit  (LCL)  (see  Figure  5.1). 

[Figure: subgroup values plotted against Subgroup Number (1 to 7), with
horizontal lines marking the Upper Control Limit (UCL), the Central Line (CL)
and the Lower Control Limit (LCL).]

Figure 5.1     Outline of a Control Chart

Limits on the control charts were proposed and established by Shewhart at 3σ
distance on each side of the central line, where σ is the population within-subgroup
standard deviation estimated from sample ranges or standard deviations.
The 3σ limits indicate that approximately 99.7% of the subgroup values
will be included within these control limits, provided the process is in statistical
control. Interpreted another way, there is approximately a 0.3% chance, or an
average of three times in a thousand, that a value will spuriously fall outside of
the limits when the process is properly centered and in control. The possibility
that a violation of these limits is really a chance event rather than a signal of
real process change is considered so small that, when a point plots outside of
the limits, action should be taken. Since action is required at this point, the 3σ
control limits are sometimes called action limits. Frequently, 2σ limits (called
warning limits) are also drawn to warn of an impending out-of-control situation
when a value plots outside of these limits.
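
The percentages quoted for the action and warning limits can be checked with a few lines of Python. The calculation assumes that the plotted values follow a normal distribution, which the text implies but does not state; the function name is chosen here for illustration.

```python
from math import erf, sqrt

def fraction_within(k):
    """Fraction of a normal population lying within k standard deviations
    of its mean."""
    return erf(k / sqrt(2))

for k in (2, 3):
    inside = fraction_within(k)
    print(f"{k} sigma limits: {inside:.4f} inside, {1 - inside:.4f} outside")
# 2 sigma limits: 0.9545 inside, 0.0455 outside
# 3 sigma limits: 0.9973 inside, 0.0027 outside
```

Under this assumption, a value falls outside the warning limits by chance about once in 22 subgroups, against roughly three times in a thousand for the action limits, which is why a single point beyond the warning limits calls for attention rather than immediate action.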

When  a  plotted  value  falls  outside  of  either  control  limit  or  a  series  of  values 
reflects unusual patterns, the state of statistical control can no longer be accepted.
When this occurs, an investigation is initiated to locate the assignable
cause(s) and the process may be stopped or adjusted. Once the assignable cause
is  determined  and  eliminated,  the  process  is  ready  to  continue.  After  a  control 
chart  has  exhibited  a  state  of  control  over  a  reasonable  number  of  subgroup 
values,  permanent  control  chart  parameters  can  be  established. 


5.4     Shewhart  Control  Charts:  Formulas  and  Factors 

Table  5.1  provides  the  necessary  formulas  for  plotting  the  control  chart  lines  for 
control  charts  by  variables,  attributes,  and  individual  values. 
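
TABLE 5.1:     Control Limit Formulas for Shewhart Control Charts

Type         Central Line (CL)    Upper Control Limit (UCL)                  Lower Control Limit (LCL)

Variables Control Charts

$\bar{X}$    $\bar{\bar{X}}$      $\bar{\bar{X}} + A_2\bar{R}$               $\bar{\bar{X}} - A_2\bar{R}$
                                  or $\bar{\bar{X}} + A_3\bar{s}$            or $\bar{\bar{X}} - A_3\bar{s}$
R            $\bar{R}$            $D_4\bar{R}$                               $D_3\bar{R}$
s            $\bar{s}$            $B_4\bar{s}$                               $B_3\bar{s}$

Attributes Control Charts

p            $\bar{p}$            $\bar{p} + 3\sqrt{\bar{p}(1-\bar{p})/n}$   $\bar{p} - 3\sqrt{\bar{p}(1-\bar{p})/n}$
np           $n\bar{p}$           $n\bar{p} + 3\sqrt{n\bar{p}(1-\bar{p})}$   $n\bar{p} - 3\sqrt{n\bar{p}(1-\bar{p})}$
c            $\bar{c}$            $\bar{c} + 3\sqrt{\bar{c}}$                $\bar{c} - 3\sqrt{\bar{c}}$
u            $\bar{u}$            $\bar{u} + 3\sqrt{\bar{u}/n}$              $\bar{u} - 3\sqrt{\bar{u}/n}$

Control Charts for Individuals

X            $\bar{X}$            $\bar{X} + E_2\bar{R}$                     $\bar{X} - E_2\bar{R}$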

Note:   1.  The  values  of  the  factors  A2,  A3,  D3,  D4,  B3  and  B4  are  given  below  in 
Table  5.2  for  various  values  of  the  subgroup  size  n. 

2. The symbols $\bar{\bar{X}}$, $\bar{R}$, and $\bar{s}$ represent the average of the subgroup
averages, ranges, and standard deviations respectively.

3.  In  the  case  of  charts  for  individuals,  where  only  one  observation  per 
subgroup  is  available,  a  measure  of  variability  is  obtained  from  the 
moving  range  of  two  observations.  A  moving  range  is  the  absolute 
difference  between  successive  pairs  of  measurements  in  a  series,  i.e., 
the  positive  difference  between  the  first  and  second  measurement,  then 
that  between  the  second  and  third  measurement,  and  so  on.  From  these 
moving ranges, the average moving range, $\bar{R}$, is calculated and used in
the  construction  of  control  charts  for  individuals.  The  value  of  the 
factor  E2  is  obtained  as  3/d2,  where  values  of  d2  are  given  in  Table  5.2 
for  the  subgroup  size  n.  For  instance,  if  a  moving  range  of  two 
observations  is  considered,  then  n  =  2  and,  therefore,  E2  =  3/1.128  = 
2.66. 
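
The calculation described in Note 3 can be sketched in Python as follows. The function name and the five observations are hypothetical; the factor E2 = 3/d2 uses d2 = 1.128 for a moving range of two observations, as stated in the note.

```python
def individuals_chart_limits(values, d2=1.128):
    """Return (LCL, CL, UCL) for a chart of individual values, using the
    average moving range of two successive observations."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)   # average moving range
    x_bar = sum(values) / len(values)                  # central line
    e2 = 3 / d2                                        # 2.66 when d2 = 1.128
    return x_bar - e2 * mr_bar, x_bar, x_bar + e2 * mr_bar

# Hypothetical series of five individual measurements.
lcl, cl, ucl = individuals_chart_limits([10, 12, 11, 13, 12])
print(round(lcl, 2), round(cl, 2), round(ucl, 2))   # 7.61 11.6 15.59
```

Here the moving ranges are 2, 1, 2 and 1, so the average moving range is 1.5 and the limits lie 2.66 x 1.5 = 3.99 on either side of the central line of 11.6.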

TABLE 5.2:  Factors for Shewhart Variables Control Charts

Number of
Observations
in a Subgroup
(n)               A2       A3       D3       D4       B3       B4       d2

  2             1.880    2.659      —      3.267      —      3.267    1.128
  3             1.023    1.954      —      2.574      —      2.568    1.693
  4             0.729    1.628      —      2.282      —      2.266    2.059
  5             0.577    1.427      —      2.114      —      2.089    2.326
  6             0.483    1.287      —      2.004    0.030    1.970    2.534
  7             0.419    1.182    0.076    1.924    0.118    1.882    2.704
  8             0.373    1.099    0.136    1.864    0.185    1.815    2.847
  9             0.337    1.032    0.184    1.816    0.239    1.761    2.970
 10             0.308    0.975    0.223    1.777    0.284    1.716    3.078
 11             0.285    0.927    0.256    1.744    0.321    1.679    3.173
 12             0.266    0.886    0.283    1.717    0.354    1.646    3.258
 13             0.249    0.850    0.307    1.693    0.382    1.618    3.336
 14             0.235    0.817    0.328    1.672    0.406    1.594    3.407
 15             0.223    0.789    0.347    1.653    0.428    1.572    3.472


5.5     Construction  of  Shewhart  Control  Charts 

The steps involved in the construction of the $\bar{X}$ chart and the R chart are described
here as an example. The same basic procedure is followed for all the other types of
control charts.

1. Obtain data, subgroup by subgroup, by taking 20 to 25 subgroups (k), each
of size 4 or 5 (n), and measuring the characteristic of interest. The classification
of observations into subgroups should be done carefully so that the
variation within a subgroup may be considered to be due to chance causes
only and the variation between subgroups may be attributed to assignable
causes which the control chart is intended to detect.

2. For each subgroup, calculate the average, $\bar{X}$, and the range, R.

3. Compute the grand average of all the observation values, $\bar{\bar{X}}$, i.e., the
average of all the subgroup averages, and the average of all the subgroup
ranges, $\bar{R}$.

4. On a suitable form or graph paper, lay out an $\bar{X}$ and an R chart. The vertical
scale on the left is used for $\bar{X}$ or for R, as applicable, and the horizontal scale
identifies the subgroup number. Plot against the appropriate subgroup numbers
the computed values for $\bar{X}$ on the chart for averages and the values for R
on the chart for ranges. On these respective charts, draw solid horizontal
lines to represent $\bar{\bar{X}}$ and $\bar{R}$, thus situating the central lines.

5. Place the control limits on these charts. On the $\bar{X}$ chart, draw two horizontal
dotted lines at $\bar{\bar{X}} \pm A_2\bar{R}$ and, on the R chart, draw a horizontal dotted line at
$D_3\bar{R}$ and another at $D_4\bar{R}$, where A2, D3 and D4 depend on n, the number of
observations in a subgroup, and are given in Table 5.2. The LCL on the R
chart is not needed whenever n is less than 7 since the ensuing value of D3 is
considered to be zero.

6.  Plot  the  R  chart  first  by  joining  consecutive  points  with  straight  lines. 
Check  for  data  points  outside  the  control  limits,  signaling  an  out-of-control 
situation,  and  for  unusual  patterns  or  trends.  For  each  indication  of  an 
assignable  cause  in  the  range  data,  conduct  an  analysis  of  the  operation  of 
the  process  to  determine  the  cause;  correct  that  condition  and  plan  to 
prevent  its  recurrence. 

7. Exclude all subgroups affected by an identified assignable cause; then,
recalculate and plot the new average range, $\bar{R}$, with its revised control
limits. Verify that all the range points now confirm statistical control when
compared to the new limits, repeating the identification/correction/
recalculation sequence if necessary.

8. Any subgroup dropped from the R chart because of identified assignable
causes should also be excluded from the $\bar{X}$ chart. The revised $\bar{R}$
and $\bar{\bar{X}}$ should be used to recalculate the trial control limits for averages,
$\bar{\bar{X}} \pm A_2\bar{R}$.

9.  When  the  ranges  are  in  statistical  control,  the  process  spread,  i.e.,  the 
within-subgroup  variation,  is  considered  to  be  stable.  The  averages  can  then 
be  analyzed  to  see  if  the  process  location  is  changing  over  time. 

10. Now plot the $\bar{X}$ chart and check for data points outside the control limits,
signaling an out-of-control condition, and for unusual patterns or trends.
As with the R chart, analyze any out-of-control condition and take corrective
and preventive action. Exclude any subgroup exhibiting out-of-control
points for which assignable causes have been found; recalculate and plot the
new process average, $\bar{\bar{X}}$, with its revised control limits and confirm statistical
control, repeating the identification/correction/recalculation sequence
if necessary.

11. When the range and average values are consistently contained within the
trial control limits, extend these limits to cover future periods. These limits
would then be used for an ongoing control of the process, with the responsible
individuals (operator and/or supervisor) responding to signs of out-of-
control conditions appearing on either the $\bar{X}$ or R chart with prompt action.
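
The arithmetic of steps 2 to 5 can be sketched in Python as follows. The function name, the dictionary layout and the three small subgroups are hypothetical and serve only to show the calculation; a real study would use 20 to 25 subgroups as recommended in step 1. The factors are copied from Table 5.2 for subgroup sizes 4 and 5, with D3 taken as zero.

```python
# Factors from Table 5.2 for the subgroup sizes recommended in step 1.
# D3 is taken as zero for n < 7, so the R chart has no effective lower limit.
FACTORS = {
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114},
}

def xbar_r_limits(subgroups):
    """Trial central lines and control limits for the X-bar and R charts.

    subgroups: list of equal-sized lists of measurements (size 4 or 5 here).
    Each chart is returned as a tuple (LCL, CL, UCL).
    """
    n = len(subgroups[0])
    f = FACTORS[n]
    means = [sum(g) / n for g in subgroups]            # step 2: subgroup averages
    ranges = [max(g) - min(g) for g in subgroups]      # step 2: subgroup ranges
    grand_mean = sum(means) / len(means)               # step 3: grand average
    r_bar = sum(ranges) / len(ranges)                  # step 3: average range
    return {
        "means": means,
        "ranges": ranges,
        "xbar": (grand_mean - f["A2"] * r_bar, grand_mean,
                 grand_mean + f["A2"] * r_bar),        # step 5
        "r": (f["D3"] * r_bar, r_bar, f["D4"] * r_bar),
    }

# Hypothetical data: three subgroups of size 4.
chart = xbar_r_limits([[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]])
print([round(v, 3) for v in chart["xbar"]])   # [1.313, 3.5, 5.687]
print([round(v, 3) for v in chart["r"]])      # [0.0, 3.0, 6.846]
```

Steps 7, 8 and 10 then amount to removing the affected subgroups from the list and calling the function again on what remains.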

5.6     Process Control and Process Capability

A process is deemed to be stable if the points plotted from the subgroup data
fall within the control limits and show no nonrandom pattern. An out-of-control
condition is specified by any of the following criteria:

•  a  point  outside  of  the  control  limits 

•  a  run  of  7  consecutive  points,  all  on  the  same  side  of  the  central  line 

•  a  run  of  7  consecutive  points,  steadily  moving  up  or  steadily  going  down 

•  any  other  obviously  nonrandom  pattern  such  as  cycles,  a  gradual  change 
in  level,  grouping  or  bunching,  interaction,  a  systematic  pattern,  trends, 
etc. 
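
The first three criteria can be checked mechanically. The Python sketch below is illustrative only: the function name `out_of_control_signals` and its arguments are assumptions made for this example, and the fourth criterion (cycles and other nonrandom patterns) still calls for visual judgment.

```python
from typing import List, Optional, Sequence


def out_of_control_signals(points: Sequence[float], center: float, ucl: float,
                           lcl: Optional[float] = None, run: int = 7) -> List[str]:
    """Return descriptions of the rule violations found in the plotted points."""
    signals: List[str] = []

    # Criterion 1: a point outside the control limits.
    for i, x in enumerate(points, start=1):
        if x > ucl or (lcl is not None and x < lcl):
            signals.append(f"point {i} outside the control limits")

    # Criterion 2: `run` consecutive points on the same side of the central line.
    count, last_side = 0, 0
    for i, x in enumerate(points, start=1):
        side = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if side == 0:
            count = 0
        elif side == last_side:
            count += 1
        else:
            count = 1
        last_side = side
        if count == run:
            signals.append(f"run of {run} points on one side ending at point {i}")

    # Criterion 3: `run` consecutive points steadily rising or steadily falling.
    count, last_dir = 1, 0
    for i in range(1, len(points)):
        direction = (points[i] > points[i - 1]) - (points[i] < points[i - 1])
        if direction == 0:
            count = 1
        elif direction == last_dir:
            count += 1
        else:
            count = 2
        last_dir = direction
        if count == run:
            signals.append(f"trend of {run} points ending at point {i + 1}")

    return signals
```

Passing `lcl=None` mirrors the situation in which no lower control limit is shown on the chart.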

Once a process has been brought under statistical control, the next step is to
study its capability to meet the specifications. Process capability represents the
performance of the process itself, and its assessment begins after all the control
issues in both the X̄ and R charts have been resolved, that is, the special causes
have been identified, analyzed, corrected and prevented from recurring.

Process capability is generally measured in terms of a process capability
index, PCI (or Cp), as follows:

$$\mathrm{PCI} = \frac{\text{Tolerance Specified}}{\text{Process Capability}} = \frac{\mathrm{UTL} - \mathrm{LTL}}{6\sigma}$$

where

UTL is the upper tolerance limit,

LTL is the lower tolerance limit, and

σ is estimated from the within-subgroup variability given by $\bar{R}/d_2$.
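
As a minimal sketch of this calculation, the index can be computed directly from the tolerance limits, the average range and the factor d2. The function name and the numbers in the usage line are hypothetical; only the value d2 = 2.059 for subgroups of size 4 is taken from the text.

```python
def process_capability_index(utl: float, ltl: float, r_bar: float, d2: float) -> float:
    """PCI = (UTL - LTL) / (6 * sigma), with sigma estimated as R-bar / d2."""
    sigma_hat = r_bar / d2          # within-subgroup estimate of sigma
    return (utl - ltl) / (6.0 * sigma_hat)


# Hypothetical illustration: tolerance 9.0 to 11.0, average range 0.5, n = 4.
pci = process_capability_index(utl=11.0, ltl=9.0, r_bar=0.5, d2=2.059)
print(round(pci, 2))
```

A value above 1 means the specified tolerance is wider than the 6σ spread of the process; it says nothing about whether the process is centered within that tolerance.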

It should be noted that process capability for p and np control charts is
expressed as the average proportion conforming to the specifications. Thus, for p
and np control charts, capability = 1 − p̄. For c and u control charts, process
capability cannot be expressed in the same manner. Here, c̄ and ū are used,
respectively, as the measures of process performance.

Example 5.1    X̄ Chart and R Chart

In Table 5.3, measurements for the humidity level in skim milk powder samples
are given. Four independent measurements are taken every half hour of
production for a total of 10 hours, giving k = 20 subgroup samples, all of size
n = 4. The subgroup averages and ranges are also included in Table 5.3. The
upper and lower tolerance limits are specified as 0.219% and 0.125%,
respectively. The objective is to evaluate the process performance and to
statistically control the process with respect to its location and spread so that
the process will meet the specifications.

TABLE 5.3: Production Data on % Humidity in Skim Milk Powder

Subgroup              % Humidity                Mean     Range
Number       x1       x2       x3       x4      (X̄)      (R)

   1       0.1898   0.1729   0.2067   0.1898   0.1898   0.0338
   2       0.2012   0.1913   0.1878   0.1921   0.1931   0.0134
   3       0.2217   0.2192   0.2078   0.1980   0.2117   0.0237
   4       0.1832   0.1812   0.1963   0.1800   0.1852   0.0163
   5       0.1692   0.2263   0.2086   0.2091   0.2033   0.0571
   6       0.1621   0.1832   0.1914   0.1783   0.1788   0.0293
   7       0.2001   0.1927   0.2169   0.2082   0.2045   0.0242
   8       0.2401   0.1825   0.1910   0.2264   0.2100   0.0576
   9       0.1996   0.1980   0.2076   0.2023   0.2019   0.0096
  10       0.1793   0.1715   0.1829   0.1961   0.1822   0.0246
  11       0.2166   0.1748   0.1960   0.1923   0.1949   0.0418
  12       0.1924   0.1984   0.2377   0.2003   0.2072   0.0453
  13       0.1768   0.1986   0.2241   0.2022   0.2004   0.0473
  14       0.1923   0.1876   0.1903   0.1986   0.1922   0.0110
  15       0.1924   0.1996   0.2120   0.2160   0.2050   0.0236
  16       0.1720   0.1940   0.2116   0.2320   0.2024   0.0600
  17       0.1824   0.1790   0.1876   0.1821   0.1828   0.0086
  18       0.1812   0.1585   0.1699   0.1680   0.1694   0.0227
  19       0.1700   0.1567   0.1694   0.1702   0.1666   0.0135
  20       0.1698   0.1664   0.1700   0.1600   0.1666   0.0100

Here,

$$\bar{\bar{X}} = \frac{\sum \bar{X}}{k} = \frac{3.8480}{20} = 0.1924, \qquad \bar{R} = \frac{\sum R}{k} = \frac{0.5734}{20} = 0.0287.$$


The  first  step  is  to  plot  an  R  chart  and  evaluate  its  state  of  control. 

R  Chart: 

Central line = R̄ = 0.0287

UCL = D4R̄ = 2.282 × 0.0287 = 0.0655

LCL = D3R̄ = 0 × 0.0287 = -

Note: When n is less than 7, the LCL is not shown.

The  values  of  the  multiplying  factors  D3  and  D4  are  taken  from  Table  5.2  for 
n  =  4.  Since  the  R  values  in  Table  5.3  are  all  within  the  R  chart  control  limits, 
the  R  chart  indicates  a  state  of  statistical  control  with  respect  to  the  spread.  The 
R  value  can  now  be  used  to  calculate  the  X  chart  control  limits. 


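X̄ Chart:

$$\text{Central line} = \bar{\bar{X}} = 0.1924$$

$$\mathrm{UCL} = \bar{\bar{X}} + A_2\bar{R} = 0.1924 + (0.729 \times 0.0287) = 0.2133$$

$$\mathrm{LCL} = \bar{\bar{X}} - A_2\bar{R} = 0.1924 - (0.729 \times 0.0287) = 0.1715$$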


The value of the factor A2 is obtained from Table 5.2 for n = 4. The control
charts for X̄ and R are graphed in Figure 5.2. An examination of the X̄ chart
reveals that the last three points are signaling an out-of-control condition. They
indicate that some assignable causes of variation are operating in the process. If
the control limits had been calculated from some previous data, action would
have been called for at subgroup 18.
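
The trial limits above can be reproduced from the subgroup means and ranges in Table 5.3. The following Python sketch is only an illustration: the function and variable names are assumptions, and the factors A2, D3 and D4 are the values quoted in the text for n = 4.

```python
from statistics import mean

# Subgroup means and ranges transcribed from Table 5.3.
MEANS = [0.1898, 0.1931, 0.2117, 0.1852, 0.2033, 0.1788, 0.2045, 0.2100, 0.2019, 0.1822,
         0.1949, 0.2072, 0.2004, 0.1922, 0.2050, 0.2024, 0.1828, 0.1694, 0.1666, 0.1666]
RANGES = [0.0338, 0.0134, 0.0237, 0.0163, 0.0571, 0.0293, 0.0242, 0.0576, 0.0096, 0.0246,
          0.0418, 0.0453, 0.0473, 0.0110, 0.0236, 0.0600, 0.0086, 0.0227, 0.0135, 0.0100]

A2, D3, D4 = 0.729, 0.0, 2.282  # control chart factors quoted in the text for n = 4


def xbar_r_limits(means, ranges, a2=A2, d3=D3, d4=D4):
    """Return the central lines and trial control limits for the X-bar and R charts."""
    grand_mean, r_bar = mean(means), mean(ranges)
    return {
        "x_center": grand_mean,
        "x_ucl": grand_mean + a2 * r_bar,
        "x_lcl": grand_mean - a2 * r_bar,
        "r_center": r_bar,
        "r_ucl": d4 * r_bar,
        "r_lcl": d3 * r_bar,
    }


def outside(values, lcl, ucl):
    """Return the 1-based subgroup numbers whose value lies outside [lcl, ucl]."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]


limits = xbar_r_limits(MEANS, RANGES)
flagged_ranges = outside(RANGES, limits["r_lcl"], limits["r_ucl"])
flagged_means = outside(MEANS, limits["x_lcl"], limits["x_ucl"])
print(flagged_ranges, flagged_means)
```

Because the code carries the unrounded average range through the calculation, its limits can differ from the hand-computed values in the fourth decimal place.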


[Figure: X̄ chart (UCL = 0.2133, central line = 0.1924, LCL = 0.1715) and R chart (UCL = 0.0655, central line = 0.0287), both plotted against Subgroup Number]

Figure 5.2     Average and Range Charts for Data in Table 5.3

At this point, suitable remedial action is taken to eliminate the assignable
causes and prevent their recurrence. The charting procedure is continued by
establishing revised control limits after discarding the out-of-control points, i.e.,
the values for subgroups 18, 19 and 20. The new values for X̿, R̄ and the control
chart lines are recalculated as follows:


$$\text{Revised } \bar{\bar{X}} = \frac{\sum \bar{X}}{k} = \frac{3.3454}{17} = 0.1968$$

$$\text{Revised } \bar{R} = \frac{\sum R}{k} = \frac{0.5272}{17} = 0.0310$$

Revised  X  Chart: 

Central  line   =  X   =   0.1968 

UCL   =   X   +   A2R   =   0.1968   +   (0.729   x   0.031)   =   0.2194 

LCL   =   X   -   A2R   =   0.1968   -   (0.729   x   0.031)   =   0.1742 

Revised  R  Chart: 

Central  line   =  R   =   0.0310 

UCL   =   D4R   =   2.282   x   0.0310   =   0.0707 

LCL   =   D3R   =   0   x   0.0310   =    - 

Note:  When  n  is  less  than  7,  the  LCL  is  not  shown. 

The  revised  control  charts  are  plotted  in  Figure  5.3  and  indicate  that  the  process 
shows  a  state  of  statistical  control  with  respect  to  both  location  and  spread. 


[Figure: revised X̄ and R charts, plotted against Subgroup Number]

Figure 5.3     Revised X̄ and R Charts for Data in Table 5.3

With the process exhibiting a state of statistical control under the revised control
limits, the process capability index can now be evaluated as follows:

$$\mathrm{PCI} = \frac{\text{Tolerance Specified}}{\text{Process Capability}} = \frac{\mathrm{UTL} - \mathrm{LTL}}{6\sigma}$$

where σ is estimated as

$$\frac{\bar{R}}{d_2} = \frac{0.031}{2.059} = 0.0151.$$

Thus,

$$\mathrm{PCI} = \frac{0.219 - 0.125}{6 \times 0.0151} = \frac{0.094}{0.0906} = 1.04.$$

The value of the quantity d2 is obtained from Table 5.2 for n = 4. Since PCI is
greater than 1, the process can be considered capable. However, on close
examination, it can be seen that the process is not centered properly with respect
to the specifications: the process average of 0.1968 lies well above the
specification midpoint of 0.172. Consequently, about 11.8% of the individual
measurements (8 of the 68 values in the 17 retained subgroups) fall above the
upper specification limit. Therefore, before permanent control chart parameters
are established, attempts should be made to center the process properly while
maintaining a state of statistical control.
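
The proportion of individual measurements above the upper tolerance limit can be checked by direct counting. The sketch below uses the individual values of subgroups 1 to 17 from Table 5.3; the variable names are illustrative assumptions.

```python
UTL = 0.219  # upper tolerance limit, in % humidity

# Individual measurements for the 17 retained subgroups (Table 5.3, subgroups 1-17).
RETAINED = [
    [0.1898, 0.1729, 0.2067, 0.1898], [0.2012, 0.1913, 0.1878, 0.1921],
    [0.2217, 0.2192, 0.2078, 0.1980], [0.1832, 0.1812, 0.1963, 0.1800],
    [0.1692, 0.2263, 0.2086, 0.2091], [0.1621, 0.1832, 0.1914, 0.1783],
    [0.2001, 0.1927, 0.2169, 0.2082], [0.2401, 0.1825, 0.1910, 0.2264],
    [0.1996, 0.1980, 0.2076, 0.2023], [0.1793, 0.1715, 0.1829, 0.1961],
    [0.2166, 0.1748, 0.1960, 0.1923], [0.1924, 0.1984, 0.2377, 0.2003],
    [0.1768, 0.1986, 0.2241, 0.2022], [0.1923, 0.1876, 0.1903, 0.1986],
    [0.1924, 0.1996, 0.2120, 0.2160], [0.1720, 0.1940, 0.2116, 0.2320],
    [0.1824, 0.1790, 0.1876, 0.1821],
]

values = [x for subgroup in RETAINED for x in subgroup]   # flatten to 68 readings
above = [x for x in values if x > UTL]                    # readings above the UTL
fraction_above = len(above) / len(values)
print(len(above), len(values), round(fraction_above, 3))
```

This is an observed proportion from the sample, not an estimate from a fitted distribution.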


Example  5.2    Fraction  Nonconforming  (p)  Chart 

In a food processing company, it was decided to install a fraction
nonconforming p chart to control the performance of the machine labelling the
canned food products. Data were collected and analyzed for a period of 26 working days.
From  each  day's  production,  a  random  sample  of  cans  was  collected  at  the  end 
of  the  day  and  each  one  examined  for  nonconformance  in  labelling.  The  results 
are  shown  in  Table  5.4. 

TABLE 5.4: Process Control for Labelling Machine

            Number      Number      Fraction
Subgroup    of Cans     Noncon-     Noncon-
Number      Inspected   forming     forming      UCL      LCL

   1          158         11         0.070      0.117    0.003
   2          140         11         0.079      0.120      -
   3          140          8         0.057      0.120      -
   4          155          6         0.039      0.117    0.003
   5          160          4         0.025      0.116    0.004
   6          144          7         0.049      0.119    0.001
   7          139         10         0.072      0.120      -
   8          151         11         0.073      0.118    0.002
   9          163          9         0.055      0.116    0.004
  10          148          5         0.034      0.119    0.001
  11          150          2         0.013      0.118    0.002
  12          153          7         0.046      0.118    0.002
  13          149          7         0.047      0.118    0.002
  14          145          8         0.055      0.119    0.001
  15          160          6         0.038      0.116    0.004
  16          165         15         0.091      0.115    0.005
  17          136         18         0.132      0.121      -
  18          153         10         0.065      0.118    0.002
  19          150          9         0.060      0.118    0.002
  20          148          5         0.034      0.119    0.001
  21          135          0         0.000      0.121      -
  22          165         12         0.073      0.115    0.005
  23          143         10         0.070      0.120    0.000
  24          138          8         0.058      0.121      -
  25          144         14         0.097      0.119    0.001
  26          161         20         0.124      0.116    0.004

Total        3893        233

The values of the fraction nonconforming, p, calculated for each subgroup, are
also given in Table 5.4. The average fraction nonconforming for the month is
calculated as follows:

$$\bar{p} = \frac{\text{Total number nonconforming}}{\text{Total number inspected}} = \frac{233}{3893} = 0.060$$

Since the subgroup sizes are different, the UCL and LCL values are calculated
for each subgroup separately from

$$\bar{p} \pm 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$

where n is the size of the subgroup. These values are also included in Table 5.4.

It can be seen that plotting the UCL and LCL values for each subgroup is a
time-consuming task. However, it can be observed from Table 5.4 that the
fractions nonconforming for subgroups 17 and 26 fall outside their
corresponding upper control limits. These two subgroups are eliminated from the
data as they are shown to be subject to variations other than those affecting the
other subgroups. To include them in the computations would result in an
overestimated process average and control limits which would not reflect the true
random variations in the process. The causes of these high values should be
sought so that corrective action may be taken to prevent future occurrences. A
revised average fraction nonconforming is then calculated from the remaining
24 subgroup values (195 nonconforming cans out of 3596 inspected) as

$$\bar{p} = \frac{195}{3596} = 0.054.$$
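
The screening of the 26 subgroups against their individual limits, and the recalculation of the average after removing the flagged subgroups, can be sketched as follows. The names are assumptions made for this illustration, and a negative lower limit is clamped to zero here to reflect that no LCL is shown in that case.

```python
from math import sqrt

# Cans inspected and cans nonconforming per subgroup, transcribed from Table 5.4.
INSPECTED = [158, 140, 140, 155, 160, 144, 139, 151, 163, 148, 150, 153, 149,
             145, 160, 165, 136, 153, 150, 148, 135, 165, 143, 138, 144, 161]
NONCONFORMING = [11, 11, 8, 6, 4, 7, 10, 11, 9, 5, 2, 7, 7,
                 8, 6, 15, 18, 10, 9, 5, 0, 12, 10, 8, 14, 20]


def p_limits(p_bar: float, n: int):
    """Return (LCL, UCL) for a subgroup of size n; a negative LCL is clamped to 0."""
    half_width = 3.0 * sqrt(p_bar * (1.0 - p_bar) / n)
    return max(p_bar - half_width, 0.0), p_bar + half_width


p_bar = sum(NONCONFORMING) / sum(INSPECTED)

# Flag each subgroup whose fraction nonconforming falls outside its own limits.
flagged = []
for number, (n, d) in enumerate(zip(INSPECTED, NONCONFORMING), start=1):
    lcl, ucl = p_limits(p_bar, n)
    if not lcl <= d / n <= ucl:
        flagged.append(number)

# Recompute the average from the subgroups that were not flagged.
kept = [(n, d) for i, (n, d) in enumerate(zip(INSPECTED, NONCONFORMING), start=1)
        if i not in flagged]
revised_p_bar = sum(d for _, d in kept) / sum(n for n, _ in kept)
print(flagged, round(p_bar, 3), round(revised_p_bar, 3))
```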

Calculating  the  revised  UCL  and  LCL  values  for  each  subgroup,  by  using  the 
revised  p  value,  would  reveal  that  all  the  fractions  nonconforming  are  within 
their  corresponding  UCL  values.  Hence,  this  revised  value  of  p  is  taken  as  the 
standard  fraction  nonconforming  for  the  purpose  of  constructing  control  charts 
so  that  the  central  line  is  situated  at  p   =   0.054. 

As remarked above, the plotting of upper control limits for each subgroup of
varying size is a time-consuming and tedious process. However, since the
subgroup sizes do not vary widely from the average subgroup size, which is
evaluated as approximately 150, the revised p chart can be plotted with an upper
control limit based on the average subgroup size of n = 150. Thus, the revised p
chart lines are calculated as follows:

Revised p Chart:

$$\text{Central line} = \bar{p} = 0.054$$

$$\mathrm{UCL} = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.054 + 3\sqrt{\frac{0.054 \times 0.946}{150}} = 0.109$$

$$\mathrm{LCL} = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.054 - 3\sqrt{\frac{0.054 \times 0.946}{150}} < 0$$

Note: Since negative values for p are not possible, the lower control limit is not
shown.

The  revised  p  chart  is  plotted  in  Figure  5.4  and  illustrates  that  the  process  is 
exhibiting  a  state  of  statistical  control. 


[Figure: p chart with UCL = 0.109 and central line p̄ = 0.054, plotted against Subgroup Number]

Figure 5.4     Revised p Chart for Data in Table 5.4

5.7     Cumulative  Sum  Control  Charts 

The  cumulative  sum  charts,  more  commonly  known  as  Cusum  charts,  were 
developed  by  Page  in  1961  as  an  alternative  to  Shewhart  control  charts  to 
exercise  tighter  process  control  through  faster  detection  and  correction  of  small 
process deviations from control. The Shewhart control chart uses only the
information about the process contained in the last plotted point and ignores any
information  contributed  by  the  entire  sequence  of  points,  except  by  way  of  tests 
for  runs  or  the  use  of  warning  limits.  The  Cusum  chart  directly  incorporates 
information  about  the  whole  sequence  of  sample  values  by  plotting  the 
cumulative  sum  of  the  deviations  of  the  sample  values  from  a  preselected  target 
value. 

Both the Shewhart chart and the Cusum chart have their own merits and demerits.
Whereas the Shewhart chart is more effective in detecting larger short-term
changes in the process level, the Cusum chart is more effective in detecting
sustained changes within the region 0.5σ to 2σ. The Cusum chart possesses
greater sensitivity in visually detecting small process shifts and noting the time
at which the change(s) occurred. However, it is slow in detecting large process
shifts, and the diagnosis of patterns is difficult because the successive points
are not independent and uncorrelated. A cautious analyst would make use of
both types of charts: Cusum charts for quickly detecting small process changes,
and Shewhart charts for analyzing past data to detect lack of control and to
bring a process under statistical control.

5.7.1     Procedure for Cusum Charts

The Cusum chart is a graphic plot of the running summation of the process
deviations from a control or target value. Its underlying mathematical concept
and decision rules are highly involved but its construction is simple.

1. Identify the control or target value, T. The target value is the accepted or
expected value of the variable under examination.

2. Calculate the deviation of each observed subgroup average from T, i.e.,
X̄_i − T, where X̄_i is the average of the ith subgroup measurements for
i = 1, 2, ..., k.

3. Calculate and plot the cumulative sum of these deviations,

$$S_k = \sum_{i=1}^{k} (\bar{X}_i - T),$$

against the total number of subgroups recorded, k, on suitable graph paper.
A convenient scaling convention is to regard the horizontal distance between
consecutive plotted Cusum (S_k) values on the X-axis as one unit and have
the same distance on the vertical scale or Y-axis represent approximately 2σ
units, where σ is estimated as the average of the within-subgroup standard
deviations, similar to that for Shewhart charts.
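
The three steps above can be sketched in a few lines of Python. The function name `cusum` and the sample values are illustrative assumptions, not part of the procedure itself.

```python
from itertools import accumulate
from typing import List, Sequence


def cusum(subgroup_means: Sequence[float], target: float) -> List[float]:
    """Return S_1, ..., S_k: the running sums of the deviations (mean - target)."""
    deviations = (x - target for x in subgroup_means)  # step 2
    return list(accumulate(deviations))                # step 3


# Illustrative subgroup averages with a target value T = 15 (step 1).
sums = cusum([12, 17, 14, 14, 17, 16, 14, 11], target=15)
print(sums)
```

Each value in the returned list is plotted against its subgroup number to form the chart.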


5.7.2     Decision  Rules  for  Cusum  Charts 

If a statistical tool is to have wide practical usage by non-statisticians, then a
reasonable degree of simplicity and standardization becomes extremely
valuable. The Shewhart control charts with 2σ and 3σ limits possess these virtues
and, consequently, owe much of their wide acceptance to them. The exact
decision rules for Cusum charts, on the other hand, are very involved
mathematically. These rules require the construction of a V-mask based on the
approximate solutions of a random walk with absorbing barriers on the edge of
the V. The V-mask is superimposed on the Cusum chart and defines a decision
interval. The process is then considered to be performing satisfactorily as long
as all the previously plotted points are visible and not obscured by the mask.

Instead  of  going  through  the  mathematical  derivation  of  V-masks  here,  we  shall 
limit  our  discussion  to  some  simple  decision  rules  useful  in  studying  the  state  of 
control  of  a  process  from  a  Cusum  chart. 

As a general rule, it should be noted that the Cusum graph is essentially
horizontal when the process is in control at the target value T. If the mean shifts
upward to some value T1 > T, then an upward or positive drift will develop in
the cumulative sum. If the mean shifts downward to some value T2 < T, a
downward or negative drift will develop. Therefore, if a trend develops in the
plotted points either upward or downward, this serves as an indication of a shift
in the process mean and a search should be initiated for some assignable
cause(s).

Two simple methods are described below to determine when a turning point on
the Cusum chart is significant.

Method 1:

Draw a straight line by eye through the points on the cumulative sum chart,
extending the direction that the graph would have taken in the absence of the
apparent upward change, i.e., estimate the average value of the points before the
suspected point of change (see Figure 5.5). At this point, mark a point A at a
distance of 5σ above the value of the cumulative sum, where σ is the standard
deviation of the short-term variability of the series and is estimated as given
above. This identifies the decision interval. Then, at a point ten intervals further
along the chart, mark a point B at a height of 10σ above the position that the
graph would have reached if there had been no apparent change. Draw the
decision line AB. If the plotted points fall above the line AB, a change is
declared at the suspected point. Unless the graph crosses this decision line
before the next suspected point of change, no significant change at point A is
indicated. At any further indication of a change in slope, a new decision line is
similarly drawn and this suspected change is again assessed by reference to the
new decision line.
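
Method 1 can be expressed numerically. In the sketch below, which handles an upward change only, the line drawn by eye is replaced by a `baseline_slope` argument (the Cusum increment per subgroup interval expected without a change); that argument, the function name and the sample series are all assumptions made for illustration. Point A lies 5σ above the Cusum at the suspected point, and point B lies ten intervals later at 10σ above the projected position, so the line AB rises by the baseline slope plus 0.5σ per interval.

```python
from typing import Optional, Sequence


def first_crossing(cusum: Sequence[float], change_idx: int, sigma: float,
                   baseline_slope: float = 0.0) -> Optional[int]:
    """Return the 0-based index of the first point above decision line AB, or None.

    change_idx is the 0-based index of the suspected point of change.
    """
    s_change = cusum[change_idx]
    for j in range(1, len(cusum) - change_idx):
        # Height of line AB at j intervals past the suspected point:
        # A = s_change + 5*sigma at j = 0,
        # B = s_change + 10*baseline_slope + 10*sigma at j = 10.
        line = s_change + 5.0 * sigma + j * (baseline_slope + 0.5 * sigma)
        if cusum[change_idx + j] > line:
            return change_idx + j
    return None


# Illustrative series: flat up to index 2, then rising by 3 units per interval.
series = [0, 0, 0, 3, 6, 9, 12, 15]
crossing = first_crossing(series, change_idx=2, sigma=1.0)
print(crossing)
```

A result of `None` corresponds to the case in which the graph never crosses the decision line, so no significant change is indicated at the suspected point.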


[Figure: cumulative sum plotted against Subgroup Number (k), with the decision line AB]

Figure 5.5     Decision Interval: Method 1

Method 2:

This  method  is  an  extension  of  Method  1  and  requires  the  development  of  a 
truncated  V-mask.  The  V-mask  may  be  cut  out  from  cardboard  or  paper  or  a 
transparent  material,  which  may  be  more  useful.  The  construction  of  the  mask  is 
detailed  in  Figure  5.6.  This  mask  is  used  by  placing  the  datum  A  over  any  point 
on  the  chart;  this  will  often  be  the  most  recently  plotted  point  or  the  last  point  in 
some  segment  of  particular  interest.  The  AF-axis  is  laid  parallel  to  the  subgroup 
number  axis  of  the  chart,  both  axes  possessing  subgroup  intervals  of  equal 
length.  If  the  Cusum  values  wander  beyond  the  sloping  axes  BD  or  CE,  called 
the  decision  lines,  a  significant  departure  from  the  target  value  is  signaled. 
However,  if  the  entire  Cusum  path  remains  inside  these  arms,  no  significant 
shift  is  indicated. 


[Figure: truncated V-mask, with the horizontal scale marked in subgroup intervals (10, 8, 6, 4, 2) and the vertical scale in decision intervals]

Figure 5.6     Decision Rule with Truncated Mask: Method 2


Example  5.3     Cusum  Chart  with  Truncated  V-mask 

Table  5.5  gives  33  subgroup  average  values  for  a  food  production  process 
observed  over  a  particular  time  sequence.  A  target  value  of  T  =  15  was  deemed 
appropriate. 

TABLE 5.5: Data for Cusum Analysis

Subgroup
Number (k)      X̄      X̄ − T     S_k = Σ(X̄ − T)

     1          12       -3           -3
     2          17       +2           -1
     3          14       -1           -2
     4          14       -1           -3
     5          17       +2           -1
     6          16       +1            0
     7          14       -1           -1
     8          11       -4           -5
     9          13       -2           -7
    10          14       -1           -8
    11          15        0           -8
    12          11       -4          -12
    13          14       -1          -13
    14          16       +1          -12
    15          13       -2          -14
    16          14       -1          -15
    17          11       -4          -19
    18          12       -3          -22
    19          13       -2          -24
    20          16       +1          -23
    21          12       -3          -26
    22          18       +3          -23
    23          18       +3          -20
    24          17       +2          -18
    25          20       +5          -13
    26          15        0          -13
    27          14       -1          -14
    28          18       +3          -11
    29          20       +5           -6
    30          16       +1           -5
    31          18       +3           -2
    32          14       -1           -3
    33          16       +1           -2


The Cusum chart is plotted in Figure 5.7. For comparative analysis, a
conventional Shewhart chart is also plotted in Figure 5.8.

Figure 5.7     Cusum Chart for Data in Table 5.5

Figure 5.8     Conventional Chart for Data in Table 5.5

As  can  be  seen  from  Figure  5.7,  the  chart  clearly  divides  into  three  distinct 
segments.  From  subgroup  1  to  7,  the  Cusum  path  is  roughly  horizontal  about 
zero,  suggesting  that  these  observations  come  from  a  population  whose  mean  is 
close  to  the  target  value.  From  subgroup  8  to  21,  the  path  is  recognizably 
moving  downward  and  the  observations,  therefore,  appear  to  have  been  sampled 
from a population whose mean is below 15. Finally, from subgroup 22 to 33,
the path is shifting upward, indicating that the observations belong to a
population whose mean is greater than 15.

The  conventional  Shewhart  chart,  on  the  other  hand,  only  indicates  that  the  last 
dozen  values  are  clustered  around  a  different  mean  level  than  the  first  20  or  so. 
Plotting  in  the  Cusum  mode,  therefore,  provides  a  much  clearer  picture  of  the 
trends  actually  present  in  the  data. 

Figures 5.9 and 5.10 show a truncated V-mask applied at two different points on
the Cusum chart. The value of σ is assumed to be 2.0. With the mask applied at
subgroup 16, the Cusum path remains well within the sloping arms and so the
segment from subgroup 8 to 16 does not significantly differ from the target
value of T = 15. However, when the mask is applied at subgroup 18, the Cusum
path is seen to touch the upper decision line, indicating that, by the time the
eighteenth subgroup average value is obtained, sufficient evidence has been
accumulated to signal a significant downward shift from the target value.

Figure 5.9     Truncated V-Mask Applied to Cusum Chart (Figure 5.7) at Subgroup 16

Figure 5.10     Truncated V-Mask Applied to Cusum Chart (Figure 5.7) at Subgroup 18


At this point, corrective action is taken to identify and eliminate the assignable
causes of variability and the process is reset as needed. Further data plotting and
Cusum analysis are then performed until the process exhibits a state of statistical
control.

TABLE 1.     Normal Curve Areas

  z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09

 0.0   .0000   .0040   .0080   .0120   .0160   .0199   .0239   .0279   .0319   .0359
 0.1   .0398   .0438   .0478   .0517   .0557   .0596   .0636   .0675   .0714   .0753
 0.2   .0793   .0832   .0871   .0910   .0948   .0987   .1026   .1064   .1103   .1141
 0.3   .1179   .1217   .1255   .1293   .1331   .1368   .1406   .1443   .1480   .1517
 0.4   .1554   .1591   .1628   .1664   .1700   .1736   .1772   .1808   .1844   .1879
 0.5   .1915   .1950   .1985   .2019   .2054   .2088   .2123   .2157   .2190   .2224
 0.6   .2257   .2291   .2324   .2357   .2389   .2422   .2454   .2486   .2517   .2549
 0.7   .2580   .2611   .2642   .2673   .2704   .2734   .2764   .2794   .2823   .2852
 0.8   .2881   .2910   .2939   .2967   .2995   .3023   .3051   .3078   .3106   .3133
 0.9   .3159   .3186   .3212   .3238   .3264   .3289   .3315   .3340   .3365   .3389
 1.0   .3413   .3438   .3461   .3485   .3508   .3531   .3554   .3577   .3599   .3621
 1.1   .3643   .3665   .3686   .3708   .3729   .3749   .3770   .3790   .3810   .3830
 1.2   .3849   .3869   .3883   .3907   .3925   .3944   .3962   .3980   .3997   .4015
 1.3   .4032   .4049   .4066   .4082   .4099   .4115   .4131   .4147   .4162   .4177
 1.4   .4192   .4207   .4222   .4236   .4251   .4265   .4279   .4292   .4306   .4319
 1.5   .4332   .4345   .4357   .4370   .4382   .4394   .4406   .4418   .4429   .4441
 1.6   .4452   .4463   .4474   .4484   .4495   .4505   .4515   .4525   .4535   .4545
 1.7   .4554   .4564   .4573   .4582   .4591   .4599   .4608   .4616   .4625   .4633
 1.8   .4641   .4649   .4656   .4664   .4671   .4678   .4686   .4693   .4699   .4706
 1.9   .4713   .4719   .4726   .4732   .4738   .4744   .4750   .4756   .4761   .4767
 2.0   .4772   .4778   .4783   .4788   .4793   .4798   .4803   .4808   .4812   .4817
 2.1   .4821   .4826   .4830   .4834   .4838   .4842   .4846   .4850   .4854   .4857
 2.2   .4861   .4864   .4868   .4871   .4875   .4878   .4881   .4884   .4887   .4890
 2.3   .4893   .4896   .4898   .4901   .4904   .4906   .4909   .4911   .4913   .4916
 2.4   .4918   .4920   .4922   .4925   .4927   .4929   .4931   .4932   .4934   .4936
 2.5   .4938   .4940   .4941   .4943   .4945   .4946   .4948   .4949   .4951   .4952
 2.6   .4953   .4955   .4956   .4957   .4959   .4960   .4961   .4962   .4963   .4964
 2.7   .4965   .4966   .4967   .4968   .4969   .4970   .4971   .4972   .4973   .4974
 2.8   .4974   .4975   .4976   .4977   .4977   .4978   .4979   .4979   .4980   .4981
 2.9   .4981   .4982   .4982   .4983   .4984   .4984   .4985   .4985   .4986   .4986
 3.0   .4987   .4987   .4987   .4988   .4988   .4989   .4989   .4989   .4990   .4990
 3.1   .4990   .4991   .4991   .4991   .4992   .4992   .4992   .4992   .4993   .4992
 3.2   .4993   .4993   .4994   .4994   .4994   .4994   .4994   .4995   .4995   .4995
 3.3   .4995   .4995   .4995   .4996   .4996   .4996   .4996   .4996   .4996   .4997
 3.4   .4997   .4997   .4997   .4997   .4997   .4997   .4997   .4997   .4997   .4998
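
The tabulated entries are consistent with the area under the standard normal curve between 0 and z (for example, .3413 at z = 1.00). Assuming that reading of the table, the entries can be reproduced with Python's standard library; the function name is illustrative.

```python
from math import erf, sqrt


def normal_area_0_to_z(z: float) -> float:
    """Area under the standard normal curve between 0 and z."""
    # Phi(z) - 0.5 = 0.5 * erf(z / sqrt(2)) for the standard normal distribution.
    return 0.5 * erf(z / sqrt(2.0))


for z in (1.00, 1.96, 3.00):
    print(f"z = {z:.2f}: area = {normal_area_0_to_z(z):.4f}")
```

For z values between tabulated entries, the function removes the need for interpolation.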

TABLE 2.     Percentage Points of the t-Distribution

 d.f.    .40      .25      .20      .10      .05      .025     .01      .005     .0005

   1    0.3250   0.7270   1.376    3.078    6.3138   12.706   31.821   63.657   636.619
   2     .2885    .6172   1.061    1.886    2.9200   4.3027   6.965    9.9248   31.598
   3     .2766    .5840    .978    1.638    2.3534   3.1825   4.541    5.8409   12.924
   4     .2707    .5692    .941    1.533    2.1318   2.7764   3.747    4.6041   8.610
   5     .2672    .5598    .920    1.476    2.0150   2.5706   3.365    4.0321   6.869
   6     .2648    .5536    .906    1.440    1.9432   2.4469   3.143    3.7074   5.959
   7     .2632    .5493    .896    1.415    1.8946   2.3646   2.998    3.4995   5.408
   8     .2619    .5461    .889    1.397    1.8595   2.3060   2.896    3.3554   5.041
   9     .2610    .5436    .883    1.383    1.8331   2.2622   2.821    3.2498   4.781
  10     .2602    .5416    .879    1.372    1.8125   2.2281   2.764    3.1693   4.587
  11     .2596    .5400    .876    1.363    1.7939   2.2010   2.718    3.1058   4.437
  12     .2590    .5387    .873    1.356    1.7823   2.1788   2.681    3.0545   4.318
  13     .2586    .5375    .870    1.350    1.7709   2.1604   2.650    3.0123   4.221
  14     .2582    .5366    .868    1.345    1.7613   2.1448   2.624    2.9768   4.140
  15     .2579    .5358    .866    1.341    1.7530   2.1315   2.602    2.9467   4.073
  16     .2576    .5351    .865    1.337    1.7459   2.1199   2.583    2.9208   4.015
  17     .2574    .5344    .863    1.333    1.7396   2.1098   2.567    2.8982   3.965
  18     .2571    .5338    .862    1.330    1.7341   2.1009   2.552    2.8784   3.922
  19     .2569    .5333    .861    1.328    1.7291   2.0930   2.539    2.8609   3.883
  20     .2567    .5329    .860    1.325    1.7247   2.0860   2.528    2.8453   3.850
  21     .2566    .5325    .859    1.323    1.7207   2.0796   2.518    2.8314   3.819
  22     .2564    .5321    .858    1.321    1.7171   2.0739   2.508    2.8188   3.792
  23     .2563    .5318    .858    1.319    1.7139   2.0687   2.500    2.9073   3.767
  24     .2562    .5315    .857    1.318    1.7109   2.0639   2.492    2.7969   3.745
  25     .2561    .5312    .856    1.316    1.7081   2.0595   2.485    2.7874   3.725
  26     .2560    .5309    .856    1.315    1.7056   2.0555   2.479    2.7787   3.707
  27     .2559    .5307    .855    1.314    1.7033   2.0518   2.473    2.7707   3.690
  28     .2558    .5304    .855    1.313    1.7011   2.0484   2.467    2.7633   3.674
  29     .2557    .5302    .854    1.311    1.6991   2.0452   2.462    2.7564   3.659
  30     .2556    .5300    .854    1.310    1.6973   2.0423   2.457    2.7500   3.616
  35     .2553    .5292    .8521   1.3062   1.6896   2.0301   2.438    2.7239   3.5919
  40     .2550    .5286    .8507   1.3031   1.6839   2.0211   2.423    2.7045   3.5511
  45     .2549    .5281    .8497   1.3007   1.6794   2.0141   2.412    2.6896   3.5207
  50     .2547    .5278    .8489   1.2987   1.6759   2.0086   2.403    2.6778   3.4965
  60     .2545    .5272    .8477   1.2959   1.6707   2.0003   2.390    2.6603   3.4606
  70     .2543    .5268    .8468   1.2938   1.6669   1.9945   2.381    2.6480   3.4355
  80     .2542    .5265    .8462   1.2922   1.6641   1.9901   2.374    2.6388   3.4169
  90     .2541    .5263    .8457   1.2910   1.6620   1.9867   2.368    2.6316   3.4022
 100     .2540    .5261    .8452   1.2901   1.6602   1.9840   2.364    2.6260   3.3909
 120     .2539    .5258    .8446   1.2887   1.6577   1.9799   2.358    2.6175   3.3736
 140     .2538    .5256    .8442   1.2876   1.6558   1.9771   2.353    2.6114   3.3615
 160     .2538    .5255    .8439   1.2869   1.6545   1.9749   2.350    2.6070   3.3527
 180     .2537    .5253    .8436   1.2863   1.6534   1.9733   2.347    2.6035   3.3456
 200     .2537    .5252    .8434   1.2858   1.6525   1.9719   2.345    2.6006   3.3400
   ∞     .2533    .5244    .8416   1.2816   1.6449   1.9600   2.326    2.5758   3.2905

SOURCE:  Reproduced  from  Documenta  Geigy  Scientific  Tables,  7th  edition,  by  permission  of  CIBA-GEIGY 
Limited,  Basle,  Switzerland. 




TABLE 3.     Percentage Points of the χ²-Distribution

 d.f.   χ².995   χ².975   χ².9     χ².5     χ².1     χ².05    χ².025   χ².01    χ².005   χ².001   d.f.

   1     0.000    0.000    0.016    0.455    2.706    3.841    5.024    6.635    7.879   10.828     1
   2     0.010    0.051    0.211    1.386    4.605    5.991    7.378    9.210   10.597   13.816     2
   3     0.072    0.216    0.584    2.366    6.251    7.815    9.348   11.345   12.838   16.266     3
   4     0.207    0.484    1.064    3.357    7.779    9.488   11.143   13.277   14.860   18.467     4
   5     0.412    0.831    1.610    4.351    9.236   11.070   12.832   15.086   16.750   20.515     5
   6     0.676    1.237    2.204    5.348   10.645   12.592   14.449   16.812   18.548   22.458     6
   7     0.989    1.690    2.833    6.346   12.017   14.067   16.013   18.475   20.278   24.322     7
   8     1.344    2.180    3.490    7.344   13.362   15.507   17.535   20.090   21.955   26.124     8
   9     1.735    2.700    4.168    8.343   14.684   16.919   19.023   21.666   23.589   27.877     9
  10     2.156    3.247    4.865    9.342   15.987   18.307   20.483   23.209   25.188   29.588    10
  11     2.603    3.816    5.578   10.341   17.275   19.675   21.920   24.725   26.757   31.264    11
  12     3.074    4.404    6.304   11.340   18.549   21.026   23.337   26.217   28.300   32.910    12
  13     3.565    5.009    7.042   12.340   19.812   22.362   24.736   27.688   29.819   34.528    13
  14     4.075    5.629    7.790   13.339   21.064   23.685   26.119   29.141   31.319   36.123    14
  15     4.601    6.262    8.547   14.339   22.307   24.996   27.488   30.578   32.801   37.697    15
  16     5.142    6.908    9.312   15.338   23.542   26.296   28.845   32.000   34.267   39.252    16
  17     5.697    7.564   10.085   16.338   24.769   27.587   30.191   33.409   35.718   40.790    17
  18     6.265    8.231   10.865   17.338   25.989   28.869   31.526   34.805   37.156   42.312    18
  19     6.844    8.907   11.651   18.338   27.204   30.144   32.852   36.191   38.582   43.820    19
  20     7.434    9.591   12.443   19.337   28.412   31.410   34.170   37.566   39.997   45.315    20
  21     8.034   10.283   13.240   20.337   29.615   32.670   35.479   38.932   41.401   46.797    21
  22     8.643   10.982   14.042   21.337   30.813   33.924   36.781   40.289   42.796   48.268    22
  23     9.260   11.688   14.848   22.337   32.007   35.172   38.076   41.638   44.181   49.728    23
  24     9.886   12.401   15.659   23.337   33.196   36.415   39.364   42.980   45.558   51.179    24

25 

10.520 

13.120 

16.473 

24.337 

34.382 

37.652 

40.646 

44.314 

46.928 

52.620 

25 

26 

11.160 

13.844 

17.292 

25.336 

35.563 

38.885 

41.923 

45.642 

48.290 

54.052 

26 

27 

11.808 

14.573 

18.114 

26.336 

36.741 

40.113 

43.194 

46.963 

49.645 

55.476 

27 

28 

12.461 

15.308 

18.939 

27.336 

37.916 

41.337 

44.461 

48.278 

50.993 

56.892 

28 

29 

13.121 

16.047 

19.768 

28.336 

39.088 

42.557 

45.722 

49.588 

52.336 

58.301 

29 

30 

13.787 

16.791 

20.599 

29.336 

40.256 

43.773 

46.979 

50.892 

53.672 

59.703 

30 

31 

14.458 

17.539 

21.434 

30.336 

41.422 

44.985 

48.232 

52.191 

55.003 

61.098 

31 

32 

15.134 

18.291 

22.271 

31.336 

42.585 

46.194 

49.480 

53.486 

56.329 

62.487 

32 

33 

15.815 

19.047 

23.110 

32.336 

43.745 

47.400 

50.725 

54.776 

57.649 

63.870 

33 

34 

16.501 

19.806 

23.952 

33.336 

44.903 

48.602 

51.966 

56.061 

58.964 

65.247 

34 

35 

17.192 

20.569 

24.797 

34.336 

46.059 

49.802 

53.203 

57.342 

60.275 

66.619 

35 

36 

17.887 

21.336 

25.643 

35.336 

47.212 

50.998 

54.437 

58.619 

61.582 

67.985 

36 

37 

18.586 

22.106 

26.492 

36.335 

48.363 

52.192 

55.668 

59.892 

62.884 

69.346 

37 

38 

19.289 

22.878 

27.343 

37.335 

49.513 

53.384 

56.896 

61.162 

64.182 

70.703 

38 

39 

19.996 

23.654 

28.196 

38.335 

50.660 

54.572 

58.120 

62.428 

65.476 

72.055 

39 

40 

20.707 

24.433 

29.051 

39.335 

51.805 

55.758 

59.342 

63.691 

66.766 

73.402 

40 

41 

21.421 

25.215 

29.907 

40.335 

52.949 

56.942 

60.561 

64.950 

68.053 

74.745 

41 

42 

22.138 

25.999 

30.765 

41.335 

54.090 

58.124 

61.777 

66.206 

69.336 

76.084 

42 

43 

22.859 

26.785 

31.625 

42.335 

55.230 

59.304 

62.990 

67.459 

70.616 

77.419 

43 

44 

23.584 

27.575 

32.487 

43.335 

56.369 

60.481 

64.202 

68.710 

71.893 

78.750 

44 

45 

24.311 

28.366 

33.350 

44.335 

57.505 

61.656 

65.410 

69.957 

73.166 

80.077 

45 

46 

25.042 

29.160 

34.215 

45.335 

58.641 

62.830 

66.617 

71.201 

74.437 

81.400 

46 

47 

25.775 

29.956 

35.081 

46.335 

59.774 

64.001 

67.821 

72.443 

75.704 

82.720 

47 

48 

26.511 

30.755 

35.949 

47.335 

60.907 

65.171 

69.023 

73.683 

76.969 

84.037 

48 

49 

27.249 

31.555 

36.818 

48.335 

62.038 

66.339 

70.222 

74.919 

78.231 

85.351 

49 

50 

27.991 

32.357 

37.689 

49.335 

63.167 

67.505 

71.420 

76.154 

79.490 

86.661 

50 

SOURCE: Abridged with permission from Biometrika Tables for Statisticians, Vol. 1. Edited by E. S. Pearson
and H. O. Hartley, Cambridge University Press (1966).
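
For readers who want to spot-check entries of Table 3, the sketch below may help. It treats the subscript of each column as the upper-tail probability `alpha`, which is an inference from the tabulated values. For 2 degrees of freedom the percentage point has a closed form; for other degrees of freedom the sketch uses the Wilson-Hilferty cube-root approximation, which is a swapped-in technique and not the method used to build the table. The function names are illustrative.

```python
import math
from statistics import NormalDist


def chi2_point_df2(alpha):
    """Exact upper-tail percentage point for 2 degrees of freedom."""
    # For 2 d.f. the upper-tail probability is exp(-x/2), so x = -2 ln(alpha).
    return -2.0 * math.log(alpha)


def chi2_point_wilson_hilferty(df, alpha):
    """Approximate upper-tail percentage point (Wilson-Hilferty cube-root formula)."""
    z = NormalDist().inv_cdf(1.0 - alpha)  # standard normal quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3


if __name__ == "__main__":
    # Exact value for 2 d.f., to compare with the 5.991 printed in the table.
    print(round(chi2_point_df2(0.05), 3))
    # Approximate value for 50 d.f., to compare with the 67.505 printed in the table.
    print(round(chi2_point_wilson_hilferty(50, 0.05), 3))
```

The approximation is only close, not exact, and it tends to be weakest for small degrees of freedom and extreme tail probabilities, so the printed table remains the reference.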
TABLE 4. Random Numbers


93108  77033  68325  10160  38667  62441  87023  94372  06164  30700
28271  08589  83279  48838  60935  70541  53814  95588  05832  80235
21841  35545  11148  34775  17308  88034  97765  35959  52843  44895
22025  79554  19698  25255  50283  94037  57463  92925  12042  91414
09210  20779  02994  02258  86978  85092  54052  18354  20914  28460
90552  71129  03621  20517  16908  06668  29916  51537  93658  29525
01130  06995  20258  10351  99248  51660  38861  49668  74742  47181
22604  56719  21784  68788  38358  59827  19270  99287  81193  43366
06690  01800  34272  65497  94891  14537  91358  21587  95765  72605
59809  69982  71809  64984  48709  43991  24987  69246  86400  29559
56475  02726  58511  95405  70293  84971  06676  44075  32338  31980
02730  34870  83209  03138  07715  31557  55242  61308  26507  06186
74482  33990  13509  92588  10462  76546  46097  01825  20153  36271
19793  22487  94238  81054  95488  23617  15539  94335  73822  93481
19020  27856  60526  24144  98021  60564  46373  86928  52135  74919
69565  60635  65709  77887  42766  86698  14004  94577  27936  47220
69274  23208  61035  84263  15034  28717  76146  22021  23779  98562
83658  14204  09445  41081  49630  34215  89806  40930  97194  21747
78612  51102  66826  40430  54072  62164  68977  95583  11765  81072
14980  74158  78216  38985  60838  82836  42777  85321  90463  11813
63172  28010  29405  91554  75195  51183  65805  87525  35952  83204
71167  37984  52737  06869  38122  95322  41356  19391  96787  64410
78530  56410  19195  34434  83712  50397  80920  15464  81350  18673
98324  03774  07573  67864  06497  20758  83454  22756  83959  96347
55793  30055  08373  32652  02654  75980  02095  87545  88815  80086
05674  34471  61967  91266  38814  44728  32455  17057  08339  93997
15643  22245  07592  22078  73628  60902  41561  54608  41023  98345
66750  19609  70358  03622  64898  82220  69304  46235  97332  64539
42320  74314  50222  82339  51564  42885  50482  98501  02245  88990
73752  73818  15470  04914  24936  65514  56633  72030  30856  85183
97546  02188  46373  21486  28221  08155  23486  66134  88799  49496
32569  52162  38444  42004  78011  16909  94194  79732  47114  23919
36048  93973  82596  28739  86985  58144  65007  08786  14826  04896
40455  36702  38965  56042  80023  28169  04174  65533  52718  55255
33597  47071  55618  51796  71027  46690  08002  45066  02870  60012
22828  96380  35883  15910  17211  42358  14056  55438  98148  35384
00631  95925  19324  31497  88118  06283  84596  72091  53987  01477
75722  36478  07634  63114  27164  15467  03983  09141  60562  65725
80577  01771  61510  17099  28731  41426  18853  41523  14914  76661
10524  20900  65463  83680  05005  11611  64426  59065  06758  02892
93815  69446  75253  51915  97839  75427  90685  60352  96288  34248
81867  97119  93446  20862  46591  97677  42704  13718  44975  67145
64649  07689  16711  12169  15238  74106  60655  56289  74166  78561
55768  09210  52439  33355  57884  36791  00853  49969  74814  09270
38080  49460  48137  61589  42742  92035  21766  19435  92579  27683
22360  16332  05343  34613  24013  98831  17157  44089  07366  66196
40521  09057  00239  51284  71556  22605  41293  54854  39736  05113
19292  69862  59951  49644  53486  28244  20714  56030  39292  45166
79504  40078  06838  05509  68581  39400  85615  52314  83202  40313
64138  27983  84048  42631  58658  62243  82572  45211  37060  15017

SOURCE: Abstracted with permission from A Million Random Digits with 100,000 Normal Deviates, The
Rand Corporation, Santa Monica, Calif.
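
One common way to use a table such as Table 4 is to number the units of a lot from 1 to N, read successive groups of digits, and keep each number that falls within 1 to N and has not already been drawn. The sketch below illustrates that procedure under stated assumptions: the choice of the leading digits of each five-digit group, the function name `draw_sample`, and the lot size of 50 are all hypothetical and are not taken from the text.

```python
def draw_sample(groups, population_size, sample_size):
    """Select distinct unit numbers 1..population_size from random-digit groups."""
    width = len(str(population_size))  # digits needed to label every unit
    chosen = []
    for group in groups:
        value = int(group[:width])  # leading digits of the five-digit group
        if 1 <= value <= population_size and value not in chosen:
            chosen.append(value)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("ran out of random digits before the sample was complete")


# First ten groups of the table, kept as strings to preserve leading zeros.
FIRST_GROUPS = ["93108", "77033", "68325", "10160", "38667",
                "62441", "87023", "94372", "06164", "30700"]

if __name__ == "__main__":
    print(draw_sample(FIRST_GROUPS, population_size=50, sample_size=4))
```

With these ten groups and a lot of 50 units, the leading pairs 93, 77, 68, 62, 87 and 94 are skipped as out of range, and units 10, 38, 6 and 30 are selected. In practice the starting point in the table would itself be chosen at random rather than always beginning at the first group.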